Mapping Infrastructure Vulnerability to Flooding in Philadelphia

Environment
Flooding
Python
Author

Richard Barad

Published

December 14, 2023

Amenity and Building Analysis

This analysis looks at the number of amenities and buildings located within the flood hazard area in the city of Philadelphia. Amenity and building data is downloaded from Open Street Maps, and I also determine the planning district that each amenity is located in. I present the results using interactive maps created using the geopandas explore() function and charts created using the altair package.

This analysis is also availble as a PDF report.

Import Packages

I import the packages I will use for this analysis. Osmnx is used to download data from Open Street Maps. Pandas and geopandas are used to manipulate and transform data. Altair and matplotlib are used to create visualizations.

Code
import pandas as pd
import osmnx as ox
from matplotlib import pyplot as plt
import geopandas as gpd
import altair as alt
import pygris

import warnings
warnings.filterwarnings('ignore')

pd.set_option('display.max_columns', None)

Download Amenitiy Data from OSM

I download amenity data from Open Street Maps for all of Philadelphia. I focus on key public amenities that would be problematic to have in a flood hazard area and could be damaged durring a flood. The OSM amenity data is projected into an apporporiate cordinate system. I then count the number of each amenity type located in Philadelphia.

Code
amenities = ox.features_from_place("Philadelphia, PA", tags={"amenity": ["bar","place_of_worship","restaurant",\
                                                                         "school","fast_food","cafe","bank","fuel" \
                                                                        "bar","pub","library","post_office","fire_station",\
                                                                        "community_centre","kindergarten","theatre" \
                                                                        "hospital","police","cinema"]})

buildings = ox.features_from_place("Philadelphia, PA", tags={"building": True})

amenities = amenities.to_crs(6565) 

The chart below was created using matplotlib and shows the total number of each type of amenity in Philadelphia based on OSM data.

Code
fig, ax = plt.subplots(figsize=(5, 2.5))

groups = amenities.groupby('amenity')[['amenity']].count().rename({'amenity':'count'},axis=1)
groups = groups.sort_values('count',ascending=False)

plt.bar(height=groups['count'],x=groups.index)

plt.xticks(rotation = 90)
plt.title('Number of Amenities in Philadelphia based on OSM data')

plt.show()

Get Data on Flood Plains

Next, I get the flood hazard boundary data from the city’s open data portal. The 100 year and 500 year flood plain are included as seperate layers. I perform some cleaning operations to both layers and then concatenate them together using the concat function in pandas.

Code
flood_plain_100_year = gpd.read_file('https://opendata.arcgis.com/datasets/1d6d353ab50b4884b586c05ee2a661db_0.geojson').to_crs(6565)
flood_plain_500_year = gpd.read_file('https://opendata.arcgis.com/datasets/1e6f6315225544c88549478d25cc5181_0.geojson').to_crs(6565)

flood_plain_100_year.loc[flood_plain_100_year['FLOODWAY'] == 'FLOODWAY', ['ZONE']] = 'FLOODWAY'
flood_plain_100_year = flood_plain_100_year[['geometry','ZONE']].fillna('100 YEAR')

flood_plain_500_year['ZONE'] = '500 YEAR'
flood_plain_500_year = flood_plain_500_year[['geometry','ZONE']]

flood_plain = pd.concat([flood_plain_100_year,flood_plain_500_year])

The interactive map below shows where the flood hazard areas are located in Philadelphia.

Code
flood_plain.explore(column='ZONE',tiles='CartoDB positron',cmap='winter',style_kwds ={'stroke':False})
Make this Notebook Trusted to load map: File -> Trust Notebook

Identify Amenities in the Flood Zone

Next, I identify the public amenities located in the flood hazard zone using a spatial join.

Code
amenities['geometry'] = amenities.geometry.centroid
amenities_join = amenities.sjoin(flood_plain,how='left',predicate='within').reset_index()

amenities_join = amenities_join[['amenity','geometry','name','ZONE']].fillna('Not in Floodplain')
amenities_join.replace({"FLOODWAY":"100 YEAR"},inplace=True)

Data Visualization

The chart below shows the percentage of each amenity located in the 100 year flood zone, the 500 year flood zone, and amenities located outside of the flood plain. Pubs, fire stations, and fast food restaurants have the highest percentage of points located in the flood hazard zone. The high percentage of fire stations located in the flood zone is concerning, due to the imortant role this feature plays in emergancy response.

Code
domain = ['100 YEAR', '500 YEAR', 'Not in Floodplain']
range1 = ['blue','lightblue','lightgrey']

alt.Chart(amenities_join).mark_bar().encode(
    alt.X('amenity').title('Amenity Type'),
    alt.Y('count(amenity)').title('Percent of Total').stack("normalize"),
    alt.Color('ZONE').scale(domain=domain, range=range1),
    tooltip=['amenity','count(amenity)','ZONE']
).properties(
    width=400,
    height=300,
    title='Percent of Citywide Amenities Located in Areas Vulnerable to Flooding'
)

Next, I create a copy of the amenities dataframe which just includes amenities located in the flood hazard zone.

Code
filt = amenities_join['ZONE'] != 'Not in Floodplain'
amenities_flood = amenities_join.loc[filt]

The interactive map below shows the location of amenities which are located in the flood hazard zone. The amenities are colored according to the amenity type. Hover over a point to view more information about the amenity.

Code
amenities_flood.explore(column='amenity',tiles='CartoDB positron')
Make this Notebook Trusted to load map: File -> Trust Notebook

The chart below shows the number of amenities by amenity type located in the flood hazard area. The color allows for distinguishing between amenities in the 100 year and 500 year flood hazard zones. There are a large number of restaurants and fast food establishments located in flood hazard areas.

Code
domain1 = ['100 YEAR','500 YEAR']
range2 = ['blue','lightblue']

alt.Chart(amenities_flood).mark_bar().encode(
    alt.X('amenity').title('Amenity Type'),
    alt.Y('count(amenity)').title('Total'),
    alt.Color('ZONE').scale(domain=domain1, range=range2),
    tooltip=['amenity','count(amenity)','ZONE']
).properties(
    width=400,
    height=300,
    title='Number of Amenities Located in Areas Vulnerable to Flooding'
)

Analysis By Planning District

Next we analyze the number of amenities located in the flood zone by planning district. The planning district dataset is imported using an API link obtained through the Philadelphia Open Data Portal. After importing the planning districts, a spatial join is carried out to join the flood vulnerable amenities to the planning district the amenity is located in.

Code
planning_districts = gpd.read_file('https://opendata.arcgis.com/datasets/0960ea0f38f44146bb562f2b212075aa_0.geojson').to_crs(6565)

amenities_flood = amenities_flood.sjoin(planning_districts[['DIST_NAME','geometry']],how='inner')

The charts below visualize the results of the analysis. The visual is interactive and includes cross filtering between two different charts. Click on a bar in the left chart and the chart on the right will update to just show amenity data for the selected planning district. The chart highlights that the largest number of amenities that are vulnerable to flooding are located within the Lower Southwest planning district.

Code
selection = alt.selection_point(fields=['DIST_NAME'])

lines = alt.Chart().mark_bar().encode(
    alt.X('count(amenity)',title = 'Number of Amenities',axis=alt.Axis(format='0f')),
    alt.Y('amenity').title('Amenity Type'),
    color='ZONE',
    tooltip=["ZONE", "amenity", "count(amenity)"]
).properties(
    width=400,
    height=300,
    title='Amenities located within the flood hazard zone'
).transform_filter(
    selection
)

bars = alt.Chart().mark_bar().encode(
    alt.Y('count(amenity):Q').title('Number of Amenities'),
    alt.X('DIST_NAME:N').title('Planning District'),
    opacity=alt.condition(selection,'1',alt.value(0.2),legend=None)
    ).add_params(selection).properties(
    width=150,
    height=230,
    title={
        "text":"Amenities in Flood Hazard Zone",
        "subtitle": "Click on a bar to fiter right chart",
    }
)

chart = alt.hconcat(bars,lines, data=amenities_flood)
chart

Building Analysis

I also carry out an analysis to determine the number of buildings that are located within the flood hazard zone. The building data is downloaded from open street maps using osmnx. The building data from OSM includes both polygons and points - in order to ensure that all the features in my buildings dataset are points I take the centroid of the building features.

I then carry out a spatial join between the buildings and the boundaries of the flood hazard zone and create a new dataframe that only includes buildings whoose centroid is located in the flood hazard zone. I also use additional spatial joins to determine what planning district each building that is vulnerable to flooding is located in and the census block each building is located in.

Code
buildings = buildings.to_crs(6565).reset_index()
buildings['geometry'] = buildings.geometry.centroid

philly_2020_block_groups = pygris.block_groups(state='42', county='101', year=2020).to_crs(6565)

buildings_f = buildings.sjoin(flood_plain,how='inner',predicate='within')
buildings_f.drop('index_right',axis=1,inplace=True)
buildings_f = buildings_f.sjoin(planning_districts[['DIST_NAME','geometry']],how='inner',predicate='within')
buildings_f.drop('index_right',axis=1,inplace=True)
buildings_f = buildings_f.sjoin(philly_2020_block_groups[['GEOID','geometry']],how='inner',predicate='within')

Analysis by Planning District and Census Block Group

The code below counts the number of buildings in each planning district that are located within the flood hazard zone.

Code
pd_count = buildings_f.groupby('DIST_NAME')[['osmid']].count()
pd_count.rename({'osmid':'count'},inplace=True,axis=1)
planning_districts = planning_districts[['DIST_NAME','geometry']].merge(pd_count,on='DIST_NAME',how='left')

The map below is created using matplotlib and shows the number of buildings in the flood zone by planning district. The planning district with the larger number of buildings in the flood zone is the Southwest planning zone.

Code
fig, ax = plt.subplots()

planning_districts.plot(ax=ax,column="count",legend=True,cmap='OrRd',edgecolor='black')

ax.set_axis_off()
ax.set_aspect("equal")

plt.title('# of Buildings in the Flood Hazard Zone by Planning District')
plt.show()

The map below is created using matplotlib and shows the number of buildings in the flood zone by census block group.

Code
pd_count2 = buildings_f.groupby('GEOID')[['osmid']].count().reset_index()
pd_count2.rename({'osmid':'count'},inplace=True,axis=1)

philly_2020_block_groups = philly_2020_block_groups[['GEOID','geometry']].merge(pd_count2,on='GEOID',how='left')

fig, ax = plt.subplots()

philly_2020_block_groups.plot(ax=ax,facecolor='#f7f7f7')

philly_2020_block_groups.plot(ax=ax,column="count",legend=True,cmap='OrRd',edgecolor='#d1d1d1')

ax.set_axis_off()
ax.set_aspect("equal")

plt.title('# of Buildings in the Flood Hazard Zone by Census Block Group')
plt.show()