Analyzing the Historical Voter Turnout for Primary Elections in Philadelphia
Politics
Python
Author
Richard Barad
Published
December 14, 2023
Overview
This analysis was completed as part of my Python course at the University of Pennsylvania. The assignment required us to explore a dataset from the Open Philadelphia data portal and produce charts using Matplotlib, Altair, and seaborn.
Before making the charts, I cleaned the data and merged the files from several election cycles into one table. In this homework assignment I also discuss some of the advantages and disadvantages of Altair, Matplotlib, and seaborn.
Import Libraries and Read Data
Code
import pandas as pd
from matplotlib import pyplot as plt
import altair as alt
import os
import seaborn as sns

path = './/Data'
files = os.listdir(path)

df_list = []
for f in files:
    df_list.append(pd.read_csv('.//Data//' + f))

vote_data = pd.concat(df_list)
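The loop above reads every file it finds in the Data folder. As a minimal sketch of a slightly more defensive variant, assuming the folder could also contain files that are not CSVs, the function below uses `pathlib.Path.glob` to pick up only `*.csv` files. The helper name `read_all_csvs` and the `source_file` column are hypothetical and are not part of the original analysis.

```python
from pathlib import Path

import pandas as pd


def read_all_csvs(folder):
    """Read every *.csv file in `folder` and stack them into one DataFrame."""
    frames = []
    # sorted() makes the read order deterministic across operating systems
    for csv_path in sorted(Path(folder).glob("*.csv")):
        frame = pd.read_csv(csv_path)
        # Hypothetical extra column: remember which file each row came from
        frame["source_file"] = csv_path.name
        frames.append(frame)
    # ignore_index=True gives the combined frame a clean 0..n-1 index
    return pd.concat(frames, ignore_index=True)
```

Calling `read_all_csvs('Data')` would then stand in for the `os.listdir` loop, with the added column making it easy to check which election file a row came from.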
Clean Data
Code
#Shorten election field to contain just the four digit year to make visuals cleaner
vote_data['election'] = vote_data['election'].str[:4]
Code
#Group all voters who are not Democratic or Republican into an Other group and aggregate the Other group together
major_party = vote_data['political_party'].isin(['DEMOCRATIC','REPUBLICAN'])
vote_data.loc[~major_party,'political_party'] = 'OTHER'

vote_data = vote_data.groupby(['precinct_code','precinct_description','election','political_party'],as_index=False)['voter_count'].sum()
vote_data.head()
   precinct_code precinct_description election political_party  voter_count
0            101   PHILA WD 01 DIV 01     2015      DEMOCRATIC          157
1            101   PHILA WD 01 DIV 01     2015           OTHER            7
2            101   PHILA WD 01 DIV 01     2015      REPUBLICAN            3
3            101   PHILA WD 01 DIV 01     2016      DEMOCRATIC          207
4            101   PHILA WD 01 DIV 01     2016           OTHER           13
Matplotlib - Total Voter Turnout in Primary Elections in Philadelphia
I used Matplotlib for this graph because it seems best suited to simple visualizations like bar graphs, which it can produce with minimal code. For more complex visualizations like scatter plots, stacked bar graphs, and heatmaps, another library would let us make the chart with less code.
Code
#Calculate total turnout per year
turnout = vote_data.groupby('election',as_index=False)['voter_count'].sum()

from matplotlib import ticker

fig, ax = plt.subplots(figsize=(5, 3))

def addlabels(x, y):
    # Write each bar's value just below the top of the bar
    for i in range(len(x)):
        plt.text(i, y[i] - 29000, y[i], ha='center')

ax.bar(turnout['election'], turnout['voter_count'], width=0.6, color='#7a9dff')
ax.set_ylabel('Voter Turnout')
ax.set_xlabel('Election')
ax.yaxis.set_major_formatter(ticker.StrMethodFormatter("{x:,.0f}"))
ax.yaxis.grid(visible=True, linewidth=0.5, color='#e0e0e0')
ax.set_axisbelow(True)
addlabels(turnout['election'], turnout['voter_count'])
plt.title('Voter Turnout in Primary Elections in Philadelphia from 2015-2018')
plt.show()
The Matplotlib graph shows total voter turnout in each primary election from 2015 to 2018. Primary turnout was highest in 2016, which was a presidential election year. The second highest primary turnout was in 2015, which was the year of a mayoral election.
Make Boxplot Chart using Seaborn
This chart is a boxplot showing the distribution of the number of voters per precinct in primary elections from 2015 to 2018. Boxplots are included for both the Democratic and Republican primaries. The middle line in each box represents the median, the top of the box represents the upper quartile (Q3), and the bottom of the box represents the lower quartile (Q1). In other words, 50% of the data points are located within the box. The whiskers extend from the box to the minimum and maximum number of voters. I used seaborn for this visualization because the library is designed for statistical diagrams like boxplots.
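To make the parts of a boxplot concrete, here is a minimal sketch that computes the same summary statistics with pandas. The voter counts are made up for illustration and are not taken from the Philadelphia data. The quartiles use pandas' default linear interpolation, and a plotting library may use a slightly different quantile convention.

```python
import pandas as pd

# Hypothetical voter counts for nine precincts (illustrative only)
counts = pd.Series([40, 55, 60, 72, 80, 95, 110, 130, 200])

q1 = counts.quantile(0.25)      # bottom of the box
median = counts.quantile(0.50)  # line inside the box
q3 = counts.quantile(0.75)      # top of the box

# When the whiskers cover the full range, they end at the minimum and maximum
low_whisker, high_whisker = counts.min(), counts.max()

# Share of observations that fall inside the box (between Q1 and Q3, inclusive)
inside_box = counts.between(q1, q3).mean()

print(q1, median, q3, low_whisker, high_whisker, round(inside_box, 2))
```

With only nine values, the share inside the box comes out slightly above one half because the observations sitting exactly on Q1 and Q3 are counted as inside.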
Code
# Filter to just registered Democratic and Republican voters
major_party = vote_data['political_party'].isin(['DEMOCRATIC','REPUBLICAN'])
vote_data_major_party = vote_data.loc[major_party]

fig, ax = plt.subplots(figsize=(5, 4))
sns.set_theme(style="ticks", palette="pastel")

# Draw a nested boxplot to show voter counts by election and political party
# whis=(0, 100) extends the whiskers to the minimum and maximum values
box_plot = sns.boxplot(data=vote_data_major_party, x="election", y="voter_count",
                       hue="political_party", whis=(0,100), width=0.5, linewidth=0.7)

ax.set_ylabel('Voter Turnout by Precinct')
ax.set_xlabel('Election')
plt.title('Boxplot of Primary Voter Turnout By Precinct from 2015 - 2018 in Philadelphia')
plt.show()
The boxplots show that there is wide variation in the number of voters voting in each precinct. This could be because the number of people living in each precinct varies, or it could be caused by different turnout rates across precincts. Across all years, the number of voters voting in Democratic primary elections in Philadelphia is much higher than the number voting in Republican primaries. The median voter turnout was highest in the 2016 primary for both Democrats and Republicans. For Democrats, the lower quartile (Q1) in 2016 is higher than the upper quartile (Q3) in 2017 and 2018, indicating a large jump in voter turnout during the 2016 presidential election year.
Altair Visualizations
Altair Chart 1 - Percent of Total Primary Voters by Political Party
Code
alt.data_transformers.disable_max_rows()
source = vote_data

alt.Chart(source).transform_aggregate(
    total1='sum(voter_count)',
    groupby=['election', 'political_party']
).transform_joinaggregate(
    total2='sum(total1)',
    groupby=['election']
).transform_calculate(
    frac=alt.datum.total1 / alt.datum.total2
).mark_bar().encode(
    x=alt.X('total1:Q').stack("normalize").title('Percent of Total Voters'),
    y='election',
    color='political_party',
    tooltip=[
        alt.Tooltip('political_party', title='Political Party'),
        alt.Tooltip('total1:Q', title='Number of Voters', format=',.0f'),
        alt.Tooltip('frac:Q', title='Percent of Voters', format='.0%')
    ]
).properties(height=alt.Step(30), title='Percent of total Voters voting in each primary election')
This chart shows the percent of total voters voting in Democratic, Republican, and other primaries from 2015 to 2018. The chart indicates that from 2015 to 2018 between 85 and 90% of primary voters in Philadelphia voted in the Democratic primary, while between 7 and 12% voted in the Republican primary. The Republican share was highest in 2016, when 12 percent of voters voted in the Republican primary. Based on the previous graphs, we know that both Democratic and Republican turnout increased in 2016. This graph shows that the percent increase in turnout in 2016 was higher for the Republican party. We know this because the Democratic share of total voters declined in 2016 despite the increase in Democratic turnout.
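The aggregate and join-aggregate steps behind this chart can be mirrored in pandas with `groupby` and `transform`. The sketch below uses made-up totals (not the real Philadelphia counts, and with the Other group left out for brevity) to show how one party's share can fall even while its turnout rises, which is the reasoning used in the paragraph above.

```python
import pandas as pd

# Hypothetical party totals for two primaries (illustrative numbers only)
totals = pd.DataFrame({
    "election": ["2015", "2015", "2016", "2016"],
    "political_party": ["DEMOCRATIC", "REPUBLICAN", "DEMOCRATIC", "REPUBLICAN"],
    "voter_count": [900, 100, 1320, 180],
})

# Like a join-aggregate: the election total is repeated on every row
totals["election_total"] = totals.groupby("election")["voter_count"].transform("sum")

# Like the calculate step: each party's share of that election's voters
totals["frac"] = totals["voter_count"] / totals["election_total"]

# One row per election, one column per party
shares = totals.pivot(index="election", columns="political_party", values="frac")
print(shares.round(2))
```

In this toy example Democratic turnout grows from 900 to 1,320, yet the Democratic share drops from 90% to 88% because Republican turnout grows faster in relative terms (100 to 180).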
Altair Chart 2 - Total Voter Turnout in Primary Elections
Code
alt.data_transformers.disable_max_rows()
source = vote_data

alt.Chart(source).mark_bar().encode(
    x='election',
    y=alt.Y('sum(voter_count)').title('Number of Voters'),
    color='political_party',
    tooltip=[
        alt.Tooltip('political_party', title='Political Party'),
        alt.Tooltip('sum(voter_count)', title='Number of Voters', format=',.0f')
    ]
).properties(width=alt.Step(90), title='Voter Turnout from 2015-2018 in Primary Elections')
This chart shows voter turnout by party in primary elections from 2015 to 2018. For both major parties, turnout was highest in 2016, which was the year of the presidential election. The second highest turnout year was 2015, which was the year of a mayoral election. Total turnout was similar for the 2017 and 2018 primaries. However, the Democratic party had higher turnout in 2017 than in 2018, while Republican turnout was higher in 2018 than in 2017.
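A year-over-year comparison like this can also be read off a small pivot table. A minimal sketch, with made-up counts rather than the real data: it pivots party totals by election and computes the change from 2017 to 2018 for each party and for the total.

```python
import pandas as pd

# Hypothetical party totals (illustrative numbers only)
totals = pd.DataFrame({
    "election": ["2017", "2017", "2018", "2018"],
    "political_party": ["DEMOCRATIC", "REPUBLICAN", "DEMOCRATIC", "REPUBLICAN"],
    "voter_count": [1000, 90, 980, 110],
})

# One row per election, one column per party
by_party = totals.pivot_table(index="election", columns="political_party",
                              values="voter_count", aggfunc="sum")

# Change from 2017 to 2018 for each party
change = by_party.loc["2018"] - by_party.loc["2017"]

# Change in total turnout across all parties
total_change = by_party.sum(axis=1).diff().loc["2018"]

print(change)
print(total_change)
```

Here the two parties move in opposite directions by the same amount, so total turnout is unchanged even though the party-level picture differs.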
Altair Chart 3 - Percent of Voters Voting as Democrats and Republicans by Ward
Data Processing
Code
# Calculate Number of Voters by Ward for each election / political party combination.
vote_data['Ward'] = vote_data['precinct_description'].str[6:11]
vote_data_ward = vote_data.groupby(['Ward','election','political_party'],as_index=False)['voter_count'].sum()

# Calculate Percent of Residents in each ward voting for each party by election
vote_data_total = vote_data_ward.groupby(['Ward','election'],as_index=False)['voter_count'].sum()
vote_data_total.rename({'voter_count':'total_votes'},axis=1,inplace=True)
vote_w_total = vote_data_ward.merge(vote_data_total,on=['Ward','election'])
vote_w_total['pct'] = (vote_w_total['voter_count'] / vote_w_total['total_votes'])

#Pivot Data
vote_w_total_pivot = vote_w_total.pivot(index=['Ward','election','total_votes'],columns='political_party',values='pct').reset_index()
vote_w_total_pivot.head()
political_party   Ward election  total_votes  DEMOCRATIC     OTHER  REPUBLICAN
0                WD 01     2015         3848    0.901767  0.041580    0.056653
1                WD 01     2016         5661    0.869988  0.036566    0.093446
2                WD 01     2017         3358    0.911257  0.040203    0.048541
3                WD 01     2018         3704    0.904698  0.029428    0.065875
4                WD 02     2015         5540    0.901444  0.041516    0.057040
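As a quick sanity check on the pivoted table, the three party shares in each row should add up to 1. The sketch below retypes the five rows printed above by hand, so it is only a spot check under that assumption; on the full data the same two lines could be run against `vote_w_total_pivot` directly.

```python
import numpy as np
import pandas as pd

# First five rows of the pivoted table, retyped from the printed output
sample = pd.DataFrame({
    "Ward": ["WD 01", "WD 01", "WD 01", "WD 01", "WD 02"],
    "election": ["2015", "2016", "2017", "2018", "2015"],
    "total_votes": [3848, 5661, 3358, 3704, 5540],
    "DEMOCRATIC": [0.901767, 0.869988, 0.911257, 0.904698, 0.901444],
    "OTHER": [0.041580, 0.036566, 0.040203, 0.029428, 0.041516],
    "REPUBLICAN": [0.056653, 0.093446, 0.048541, 0.065875, 0.057040],
})

# The shares in each row should sum to 1, allowing for display rounding
row_sums = sample[["DEMOCRATIC", "OTHER", "REPUBLICAN"]].sum(axis=1)
assert np.allclose(row_sums, 1.0, atol=1e-5)

# Shares can be converted back into approximate voter counts
sample["dem_votes"] = (sample["DEMOCRATIC"] * sample["total_votes"]).round().astype(int)
print(sample[["Ward", "election", "dem_votes"]])
```

The tolerance is needed because the printed shares are rounded to six decimal places.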
Make Chart
Code
brush = alt.selection_interval()

(
    alt.Chart(vote_w_total_pivot)
    .transform_calculate(
        x='datum.DEMOCRATIC * 100',
        y='datum.REPUBLICAN * 100'
    )
    .mark_point()
    .encode(
        x=alt.X("x:Q", scale=alt.Scale(zero=True), title='Percent Voting as Democrat'),
        y=alt.Y("y:Q", scale=alt.Scale(zero=True, domainMax=50), title='Percent Voting as Republican'),
        # Points inside the brushed region keep their election color
        color=alt.condition(brush, "election:N", alt.value("lightgray")),
        tooltip=[
            alt.Tooltip('Ward', title='Ward'),
            alt.Tooltip('total_votes', title='Total Turnout'),
            alt.Tooltip('DEMOCRATIC', title="% of Voters Voting in Democratic Primary", format='.2%'),
            alt.Tooltip('REPUBLICAN', title="% of Voters Voting in Republican Primary", format='.2%')
        ]
    )
    .properties(width=200, height=200)
    .facet(column="election")
    .add_params(brush)
)
These charts show the percent of voters in each ward voting in the Democratic primary and in the Republican primary. In every ward in Philadelphia, more than 50 percent of voters voted in the Democratic primary, and this pattern holds across all four of the analyzed election years. In most wards, more than 80 percent of voters voted in the Democratic primary. There are several wards where the percentage of voters who vote in Republican primaries is consistently high compared to the rest of the city, including Wards 66, 64, 63 and 45. In each of these four wards, the percentage of voters voting in the Republican primary is greater than 25 percent in all four of the analyzed election years.
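A claim like "greater than 25 percent in all four years" can be checked by taking each ward's minimum share across elections. The sketch below is a minimal illustration with made-up shares and hypothetical ward labels (`WD A`, `WD B`, `WD C`); it does not use the real ward data.

```python
import pandas as pd

# Hypothetical ward-level Republican shares (illustrative, not the real data)
shares = pd.DataFrame({
    "Ward": ["WD A"] * 4 + ["WD B"] * 4 + ["WD C"] * 4,
    "election": ["2015", "2016", "2017", "2018"] * 3,
    "REPUBLICAN": [0.06, 0.09, 0.05, 0.07,
                   0.30, 0.35, 0.28, 0.31,
                   0.20, 0.27, 0.18, 0.26],
})

# A ward qualifies only if its lowest share across all elections exceeds 25%
min_share = shares.groupby("Ward")["REPUBLICAN"].min()
consistent = min_share[min_share > 0.25].index.tolist()
print(consistent)
```

Using the minimum rather than the mean keeps out wards like the hypothetical `WD C`, which crosses 25 percent in some years but not in all of them.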
Altair Chart 4 - Two Chart Dashboard
This Altair dashboard includes a scatter plot and a heatmap. To use the dashboard, click on any box in the heatmap. All points in the scatter plot will turn grey except for the selected point. You can hold down Shift on your keyboard to select multiple boxes. For example, a user might select all data points for the ward they live in to see what percent of voters there voted in the Democratic primary. A user could also select the four data points for Ward 66 in the heatmap and see that Ward 66 has the lowest percentage of voters voting in the Democratic primary across the city.
Code
selection = alt.selection_point()

points = (
    alt.Chart()
    .transform_calculate(x='datum.DEMOCRATIC * 100')
    .mark_point()
    .encode(
        x=alt.X("total_votes:Q", scale=alt.Scale(zero=True), title='Number of Voters'),
        y=alt.Y("x:Q", scale=alt.Scale(zero=True), title='Percent of Voters Voting in Democratic Primary'),
        # Selected points keep their election color; the rest turn grey
        color=alt.condition(selection, "election:N", alt.value("lightgray")),
        tooltip=[
            alt.Tooltip('Ward', title='Ward'),
            alt.Tooltip('election', title='Election Year'),
            alt.Tooltip('total_votes', title='Total Turnout'),
            alt.Tooltip('DEMOCRATIC', title="% of Voters Voting in Democratic Primary", format='.0%'),
            alt.Tooltip('REPUBLICAN', title="% of Voters Voting in Republican Primary", format='.0%')
        ]
    )
    .properties(width=650, height=400)
)

heatmap = (
    alt.Chart()
    .mark_rect()
    .encode(
        x='Ward:O',
        y='election:O',
        color='total_votes:Q',
        # Selected boxes are fully opaque; the rest are faded
        opacity=alt.condition(selection, alt.value(1), alt.value(0.2))
    )
    .add_params(selection)
    .properties(width=650)
)

chart = alt.vconcat(points, heatmap, data=vote_w_total_pivot)
chart