Exploring Environmental Justice in the Delaware River Basin using K-means Cluster Analysis

R
Environment
Author

Richard Barad

Published

May 15, 2024

Overview

This analysis was completed by Richard Barad as a capstone project for MUSA - 8020 Capstone Project/Advanced Topics In GIS. The project aims to explore how cluster analysis techniques, specifically k-means cluster analysis can be used to map and identify environmental justice areas, and group together census tracts that face similar environmental and socioeconomic vulnerabilities and risks.

Prior to carrying out my analysis I conducted an literature review of existing environmental justice mapping, and provided an over view of the weaknesses and limitations of existing tools and potential areas for improvement. Based on the identified gap, my research aimed to achieve two main goals:

  1. Explore how cluster analysis techniques, specifically k-means cluster analysis can be used for environmental justice mapping. Cluster analysis identifies groups of data that share similar characteristics. In my use case, I aim to identify census tracts that share similar health, environmental, and socioeconomic vulnerabilities. The results of the cluster analysis should also be able to help identify clusters that experience similar compounding hazards and vulnerabilities.

  2. Carryout analysis at a regional scale, using a study area that transcends state boundaries. The results of the cluster analysis, should allow decision makers to make comparisons across a larger region and support regional decision making across sate boundaries.

The study area is the Delaware River Basin, specifically the sections of the Basin located in Delaware, New Jersey, and Pennsylvania. The study area is shown in orange in the figure below. A natural boundary like a watershed is chosen as the study area because environmental pollutants do not respect political boundaries. Additionally, environmental planners are often interested in being able to analyze and make comparisons across a natural boundary such as a watershed. A cross-state analysis also allows the results to be more useful for regional level planning and decision making.

Code
#| output: false
states1 <- states() %>%
  filter(STUSPS %in% c('PA','NJ','DE','NY','MD')) %>% st_transform(proj)
Retrieving data for the year 2021
Code
states2 <- states1 %>% filter(STUSPS %in% c('PA','NJ','DE'))

de_river_basin <- st_read('https://services8.arcgis.com/5Wj4rmM3lycu9Zo6/arcgis/rest/services/DRB_SAs/FeatureServer/0/query?f=geojson&where=1=1') %>%
  st_transform(proj)
Reading layer `OGRGeoJSON' from data source 
  `https://services8.arcgis.com/5Wj4rmM3lycu9Zo6/arcgis/rest/services/DRB_SAs/FeatureServer/0/query?f=geojson&where=1=1' 
  using driver `GeoJSON'
Simple feature collection with 1 feature and 1 field
Geometry type: POLYGON
Dimension:     XY
Bounding box:  xmin: -76.39549 ymin: 38.69047 xmax: -74.3576 ymax: 42.46248
Geodetic CRS:  WGS 84
Code
#Intersect study DE, PA, and DE with Delaware River Basin to create study Area
study_area <- st_intersection(de_river_basin,st_union(states2))
Warning: attribute variables are assumed to be spatially constant throughout
all geometries
Code
ggplot()+
  geom_sf(data=study_area,color='transparent',fill='orange')+
  geom_sf(data=de_river_basin,color='lightblue', fill='transparent',linewidth=1.5)+
  geom_sf(data=states2,color='black',fill='transparent')+
  theme_void()

The full published capstone is available here.

I also created a public facing interactive dashboard deliverable which is visible below.