Pedestrian Fatalities in NYS

2019-2023

Author

Andrea Harder

Published

December 14, 2024

Introduction

Everyone deserves to feel safe in our streets whether they choose to walk, bike, bus, or drive, but the reality is that pedestrian safety has declined significantly over the past several decades in the United States. According to the U.S. Department of Transportation, 7,388 pedestrians were killed in motor vehicle crashes in 2021, which represents a 12.5 percent increase from 2020 and a 40-year high (U.S. DOT, 2023). New York State (NYS) is home to roughly 20 million people, and it is estimated that approximately 300 pedestrians die in motor vehicle crashes each year (NYS Department of Health, 2024). In 2016, the state launched a 5-year pedestrian safety campaign that would invest up to $110 million in engineering, enforcement, and education-related pedestrian safety projects (NYS, n.d.). Was the campaign effective in reducing pedestrian fatalities? How have trends changed in recent years and what are some opportunities for improving pedestrian safety moving forward? This project explores some of these questions to understand trends, common characteristics, and the geospatial distribution of fatal accidents involving pedestrians in NYS from 2019 to 2023.

Research Questions

  1. How have trends in pedestrian fatalities in NYS changed in over time?
  2. What is the geospatial distribution of crashes involving pedestrians across NYS?
  3. What are the most common causes of pedestrian fatalities in NYS?

Materials and Methods

Crashes involving injury, death, or include damages exceeding $1,000 must be reported to the Department of Motor Vehicles (DMV) in compliance with NYS Vehicle and Traffic Laws. The DMV has made some of this data available to the public through the provision of the following four datasets. These datasets can be analyzed to understand common trends and characteristics of motor vehicle crashes in NYS:

  1. Individual Information
  2. Crash Case Information
  3. Vehicle Information
  4. Violation Information

Only two of these datasets contain useful information relating to fatal accidents involving pedestrians.

The Individual Information dataset can be filtered directly to identify pedestrian fatalities. Each case reports the year of the accident and the age of the victim. Other variables are included in this dataset but are not explored in this analysis. The Case Information dataset has multiple variables that can help explain the geospatial distribution of crashes in NYS. These variables include municipality, county, and reference marker location information. Other variables, including the day of the week, lighting conditions, weather conditions, and details about what the pedestrian was doing at the time of the accident, can similarly be explored to better understand common causes and characteristics of fatalities. However, this dataset can only be filtered to identify fatal accidents involving pedestrians and there is no guarantee that the pedestrian is the one that died in the motor vehicle crash. This is a limitation of using this dataset, although a minor one, because, realistically speaking, a motorist has a better chance of walking away from a fatal collision than a pedestrian. Therefore, this dataset remains useful for exploring the research questions.

The datasets and shapefiles can be accessed using the following hyperlinks:

Individual Information

Case Information

Civil Boundaries

Reference Marker Locations

Load required packages, datasets, and shapefiles

Show the code
if(!requireNamespace("tidyverse", quietly = TRUE)){
  install.packages("tidyverse")
}
library(tidyverse)

if(!requireNamespace("dplyr", quietly = TRUE)){
  install.packages("dplyr")
}
library(dplyr)

if(!requireNamespace("ggplot2", quietly = TRUE)){
  install.packages("ggplot2")
}
library(ggplot2)

if(!requireNamespace("sf", quietly = TRUE)){
  install.packages("sf")
}
library(sf)

if(!requireNamespace("knitr", quietly = TRUE)){
  install.packages("knitr")
}
library(knitr)

if(!requireNamespace("leaflet", quietly = TRUE)){
  install.packages("leaflet")
}
library(leaflet)

if(!requireNamespace("tidycensus", quietly = TRUE)){
  install.packages("tidycensus")
}
library(tidycensus)

if(!requireNamespace("plotly", quietly = TRUE)){
  install.packages("plotly")
} 
# Downloading packages -------------------------------------------------------
- Downloading plotly from CRAN ...              OK [3 Mb in 0.29s]
- Downloading promises from CRAN ...            OK [1.6 Mb in 0.25s]
- Downloading later from CRAN ...               OK [151.9 Kb in 0.25s]
Successfully downloaded 3 packages in 2.1 seconds.

The following package(s) will be installed:
- later    [1.4.1]
- plotly   [4.10.4]
- promises [1.3.2]
These packages will be installed into "~/work/final-project-andreaharder/final-project-andreaharder/renv/library/linux-ubuntu-jammy/R-4.4/x86_64-pc-linux-gnu".

# Installing packages --------------------------------------------------------
- Installing later ...                          OK [installed binary and cached in 0.29s]
- Installing promises ...                       OK [installed binary and cached in 0.32s]
- Installing plotly ...                         OK [installed binary and cached in 0.86s]
Successfully installed 3 packages in 1.6 seconds.
Show the code
library(plotly)

if(!requireNamespace("kableExtra", quietly = TRUE)){ 
  install.packages("kableExtra")
} 
library(kableExtra)

if(!requireNamespace("leaflegend", quietly = TRUE)){ 
  install.packages("leaflegend")
} 
# Downloading packages -------------------------------------------------------
- Downloading leaflegend from CRAN ...          OK [1.4 Mb in 0.38s]
Successfully downloaded 1 package in 0.73 seconds.

The following package(s) will be installed:
- leaflegend [1.2.1]
These packages will be installed into "~/work/final-project-andreaharder/final-project-andreaharder/renv/library/linux-ubuntu-jammy/R-4.4/x86_64-pc-linux-gnu".

# Installing packages --------------------------------------------------------
- Installing leaflegend ...                     OK [installed binary and cached in 0.26s]
Successfully installed 1 package in 0.31 seconds.
Show the code
library(leaflegend)

#Motor vehicle crashes case information: Filtered by year, crash descriptor (i.e. fatal accident), and event descriptor (i.e. pedestrian collision with)
data2019ped <- read.csv("https://data.ny.gov/resource/e8ky-4vqe.csv?$query=SELECT%0A%20%20%60year%60%2C%0A%20%20%60accident_descriptor%60%2C%0A%20%20%60time%60%2C%0A%20%20%60date%60%2C%0A%20%20%60day_of_week%60%2C%0A%20%20%60police_report%60%2C%0A%20%20%60lighting_conditions%60%2C%0A%20%20%60municipality%60%2C%0A%20%20%60collision_type_descriptor%60%2C%0A%20%20%60county_name%60%2C%0A%20%20%60road_descriptor%60%2C%0A%20%20%60weather_conditions%60%2C%0A%20%20%60traffic_control_device%60%2C%0A%20%20%60road_surface_conditions%60%2C%0A%20%20%60dot_reference_marker_location%60%2C%0A%20%20%60pedestrian_bicyclist_action%60%2C%0A%20%20%60event_descriptor%60%2C%0A%20%20%60number_of_vehicles_involved%60%0AWHERE%0A%20%20(%60year%60%20IN%20(%222019%22))%0A%20%20AND%20(caseless_one_of(%60accident_descriptor%60%2C%20%22Fatal%20Accident%22)%0A%20%20%20%20%20%20%20%20%20AND%20caseless_one_of(%0A%20%20%20%20%20%20%20%20%20%20%20%60event_descriptor%60%2C%0A%20%20%20%20%20%20%20%20%20%20%20%22Pedestrian%2C%20Collision%20With%22%0A%20%20%20%20%20%20%20%20%20))")

data2020ped <- read.csv("https://data.ny.gov/resource/e8ky-4vqe.csv?$query=SELECT%0A%20%20%60year%60%2C%0A%20%20%60accident_descriptor%60%2C%0A%20%20%60time%60%2C%0A%20%20%60date%60%2C%0A%20%20%60day_of_week%60%2C%0A%20%20%60police_report%60%2C%0A%20%20%60lighting_conditions%60%2C%0A%20%20%60municipality%60%2C%0A%20%20%60collision_type_descriptor%60%2C%0A%20%20%60county_name%60%2C%0A%20%20%60road_descriptor%60%2C%0A%20%20%60weather_conditions%60%2C%0A%20%20%60traffic_control_device%60%2C%0A%20%20%60road_surface_conditions%60%2C%0A%20%20%60dot_reference_marker_location%60%2C%0A%20%20%60pedestrian_bicyclist_action%60%2C%0A%20%20%60event_descriptor%60%2C%0A%20%20%60number_of_vehicles_involved%60%0AWHERE%0A%20%20(%60year%60%20IN%20(%222020%22))%0A%20%20AND%20(caseless_one_of(%60accident_descriptor%60%2C%20%22Fatal%20Accident%22)%0A%20%20%20%20%20%20%20%20%20AND%20caseless_one_of(%0A%20%20%20%20%20%20%20%20%20%20%20%60event_descriptor%60%2C%0A%20%20%20%20%20%20%20%20%20%20%20%22Pedestrian%2C%20Collision%20With%22%0A%20%20%20%20%20%20%20%20%20))")

data2021ped <- read.csv("https://data.ny.gov/resource/e8ky-4vqe.csv?$query=SELECT%0A%20%20%60year%60%2C%0A%20%20%60accident_descriptor%60%2C%0A%20%20%60time%60%2C%0A%20%20%60date%60%2C%0A%20%20%60day_of_week%60%2C%0A%20%20%60police_report%60%2C%0A%20%20%60lighting_conditions%60%2C%0A%20%20%60municipality%60%2C%0A%20%20%60collision_type_descriptor%60%2C%0A%20%20%60county_name%60%2C%0A%20%20%60road_descriptor%60%2C%0A%20%20%60weather_conditions%60%2C%0A%20%20%60traffic_control_device%60%2C%0A%20%20%60road_surface_conditions%60%2C%0A%20%20%60dot_reference_marker_location%60%2C%0A%20%20%60pedestrian_bicyclist_action%60%2C%0A%20%20%60event_descriptor%60%2C%0A%20%20%60number_of_vehicles_involved%60%0AWHERE%0A%20%20(%60year%60%20IN%20(%222021%22))%0A%20%20AND%20(caseless_one_of(%60accident_descriptor%60%2C%20%22Fatal%20Accident%22)%0A%20%20%20%20%20%20%20%20%20AND%20caseless_one_of(%0A%20%20%20%20%20%20%20%20%20%20%20%60event_descriptor%60%2C%0A%20%20%20%20%20%20%20%20%20%20%20%22Pedestrian%2C%20Collision%20With%22%0A%20%20%20%20%20%20%20%20%20))")

data2022ped <- read.csv("https://data.ny.gov/resource/e8ky-4vqe.csv?$query=SELECT%0A%20%20%60year%60%2C%0A%20%20%60accident_descriptor%60%2C%0A%20%20%60time%60%2C%0A%20%20%60date%60%2C%0A%20%20%60day_of_week%60%2C%0A%20%20%60police_report%60%2C%0A%20%20%60lighting_conditions%60%2C%0A%20%20%60municipality%60%2C%0A%20%20%60collision_type_descriptor%60%2C%0A%20%20%60county_name%60%2C%0A%20%20%60road_descriptor%60%2C%0A%20%20%60weather_conditions%60%2C%0A%20%20%60traffic_control_device%60%2C%0A%20%20%60road_surface_conditions%60%2C%0A%20%20%60dot_reference_marker_location%60%2C%0A%20%20%60pedestrian_bicyclist_action%60%2C%0A%20%20%60event_descriptor%60%2C%0A%20%20%60number_of_vehicles_involved%60%0AWHERE%0A%20%20(%60year%60%20IN%20(%222022%22))%0A%20%20AND%20(caseless_one_of(%60accident_descriptor%60%2C%20%22Fatal%20Accident%22)%0A%20%20%20%20%20%20%20%20%20AND%20caseless_one_of(%0A%20%20%20%20%20%20%20%20%20%20%20%60event_descriptor%60%2C%0A%20%20%20%20%20%20%20%20%20%20%20%22Pedestrian%2C%20Collision%20With%22%0A%20%20%20%20%20%20%20%20%20))")

data2023ped <- read.csv("https://data.ny.gov/resource/e8ky-4vqe.csv?$query=SELECT%0A%20%20%60year%60%2C%0A%20%20%60accident_descriptor%60%2C%0A%20%20%60time%60%2C%0A%20%20%60date%60%2C%0A%20%20%60day_of_week%60%2C%0A%20%20%60police_report%60%2C%0A%20%20%60lighting_conditions%60%2C%0A%20%20%60municipality%60%2C%0A%20%20%60collision_type_descriptor%60%2C%0A%20%20%60county_name%60%2C%0A%20%20%60road_descriptor%60%2C%0A%20%20%60weather_conditions%60%2C%0A%20%20%60traffic_control_device%60%2C%0A%20%20%60road_surface_conditions%60%2C%0A%20%20%60dot_reference_marker_location%60%2C%0A%20%20%60pedestrian_bicyclist_action%60%2C%0A%20%20%60event_descriptor%60%2C%0A%20%20%60number_of_vehicles_involved%60%0AWHERE%0A%20%20(%60year%60%20IN%20(%222023%22))%0A%20%20AND%20(caseless_one_of(%60accident_descriptor%60%2C%20%22Fatal%20Accident%22)%0A%20%20%20%20%20%20%20%20%20AND%20caseless_one_of(%0A%20%20%20%20%20%20%20%20%20%20%20%60event_descriptor%60%2C%0A%20%20%20%20%20%20%20%20%20%20%20%22Pedestrian%2C%20Collision%20With%22%0A%20%20%20%20%20%20%20%20%20))")

#Combine the datasets
CaseInfo <- rbind(data2019ped,data2020ped,data2021ped,data2022ped,data2023ped)
names(CaseInfo)[15] <- "PANEL"

#Motor vehicle crashes individual information: Filtered by year, role type (i.e. pedestrian), and injury severeity (i.e. killed)

Ind2019ped <- read.csv("https://data.ny.gov/resource/ir4y-sesj.csv?$query=SELECT%0A%20%20%60year%60%2C%0A%20%20%60case_individual_id%60%2C%0A%20%20%60case_vehicle_id%60%2C%0A%20%20%60victim_status%60%2C%0A%20%20%60role_type%60%2C%0A%20%20%60seating_position%60%2C%0A%20%20%60ejection%60%2C%0A%20%20%60license_state_code%60%2C%0A%20%20%60gender%60%2C%0A%20%20%60transported_by%60%2C%0A%20%20%60safety_equipment%60%2C%0A%20%20%60injury_descriptor%60%2C%0A%20%20%60injury_location%60%2C%0A%20%20%60injury_severity%60%2C%0A%20%20%60age%60%0AWHERE%0A%20%20(%60year%60%20IN%20(%222019%22))%0A%20%20AND%20(caseless_one_of(%60role_type%60%2C%20%22Pedestrian%22)%0A%20%20%20%20%20%20%20%20%20AND%20caseless_one_of(%60injury_severity%60%2C%20%22Killed%22))")

Ind2020ped <- read.csv("https://data.ny.gov/resource/ir4y-sesj.csv?$query=SELECT%0A%20%20%60year%60%2C%0A%20%20%60case_individual_id%60%2C%0A%20%20%60case_vehicle_id%60%2C%0A%20%20%60victim_status%60%2C%0A%20%20%60role_type%60%2C%0A%20%20%60seating_position%60%2C%0A%20%20%60ejection%60%2C%0A%20%20%60license_state_code%60%2C%0A%20%20%60gender%60%2C%0A%20%20%60transported_by%60%2C%0A%20%20%60safety_equipment%60%2C%0A%20%20%60injury_descriptor%60%2C%0A%20%20%60injury_location%60%2C%0A%20%20%60injury_severity%60%2C%0A%20%20%60age%60%0AWHERE%0A%20%20(%60year%60%20IN%20(%222020%22))%0A%20%20AND%20(caseless_one_of(%60role_type%60%2C%20%22Pedestrian%22)%0A%20%20%20%20%20%20%20%20%20AND%20caseless_one_of(%60injury_severity%60%2C%20%22Killed%22))")

Ind2021ped <- read.csv("https://data.ny.gov/resource/ir4y-sesj.csv?$query=SELECT%0A%20%20%60year%60%2C%0A%20%20%60case_individual_id%60%2C%0A%20%20%60case_vehicle_id%60%2C%0A%20%20%60victim_status%60%2C%0A%20%20%60role_type%60%2C%0A%20%20%60seating_position%60%2C%0A%20%20%60ejection%60%2C%0A%20%20%60license_state_code%60%2C%0A%20%20%60gender%60%2C%0A%20%20%60transported_by%60%2C%0A%20%20%60safety_equipment%60%2C%0A%20%20%60injury_descriptor%60%2C%0A%20%20%60injury_location%60%2C%0A%20%20%60injury_severity%60%2C%0A%20%20%60age%60%0AWHERE%0A%20%20(%60year%60%20IN%20(%222021%22))%0A%20%20AND%20(caseless_one_of(%60role_type%60%2C%20%22Pedestrian%22)%0A%20%20%20%20%20%20%20%20%20AND%20caseless_one_of(%60injury_severity%60%2C%20%22Killed%22))")

Ind2022ped <- read.csv("https://data.ny.gov/resource/ir4y-sesj.csv?$query=SELECT%0A%20%20%60year%60%2C%0A%20%20%60case_individual_id%60%2C%0A%20%20%60case_vehicle_id%60%2C%0A%20%20%60victim_status%60%2C%0A%20%20%60role_type%60%2C%0A%20%20%60seating_position%60%2C%0A%20%20%60ejection%60%2C%0A%20%20%60license_state_code%60%2C%0A%20%20%60gender%60%2C%0A%20%20%60transported_by%60%2C%0A%20%20%60safety_equipment%60%2C%0A%20%20%60injury_descriptor%60%2C%0A%20%20%60injury_location%60%2C%0A%20%20%60injury_severity%60%2C%0A%20%20%60age%60%0AWHERE%0A%20%20(%60year%60%20IN%20(%222022%22))%0A%20%20AND%20(caseless_one_of(%60role_type%60%2C%20%22Pedestrian%22)%0A%20%20%20%20%20%20%20%20%20AND%20caseless_one_of(%60injury_severity%60%2C%20%22Killed%22))")

Ind2023ped <- read.csv("https://data.ny.gov/resource/ir4y-sesj.csv?$query=SELECT%0A%20%20%60year%60%2C%0A%20%20%60case_individual_id%60%2C%0A%20%20%60case_vehicle_id%60%2C%0A%20%20%60victim_status%60%2C%0A%20%20%60role_type%60%2C%0A%20%20%60seating_position%60%2C%0A%20%20%60ejection%60%2C%0A%20%20%60license_state_code%60%2C%0A%20%20%60gender%60%2C%0A%20%20%60transported_by%60%2C%0A%20%20%60safety_equipment%60%2C%0A%20%20%60injury_descriptor%60%2C%0A%20%20%60injury_location%60%2C%0A%20%20%60injury_severity%60%2C%0A%20%20%60age%60%0AWHERE%0A%20%20(%60year%60%20IN%20(%222023%22))%0A%20%20AND%20(caseless_one_of(%60role_type%60%2C%20%22Pedestrian%22)%0A%20%20%20%20%20%20%20%20%20AND%20caseless_one_of(%60injury_severity%60%2C%20%22Killed%22))")

#Combine the datasets
IndividualInfo <- rbind(Ind2019ped,Ind2020ped,Ind2021ped,Ind2022ped,Ind2023ped)

#Load shapefiles
Counties <- read_sf("data/Counties_Shoreline.shp")%>%
  mutate(NAME = toupper(NAME))

ReferenceMarkers <- read_sf("data/ReferenceMarkers.shp")

FatalAccidentsCounty <- CaseInfo %>%
  group_by(county_name) %>%
  summarize(n = n())
names(FatalAccidentsCounty)[1] <- "NAME"
names(FatalAccidentsCounty)[2] <- "Count"

MapJoin <- Counties %>%
 left_join(FatalAccidentsCounty, by = c("NAME"))

MapJoin2 <- ReferenceMarkers %>%
 inner_join(CaseInfo, by =c("PANEL"))

MapJoin3 <- MapJoin %>% 
  mutate(DeathsPer1k = (Count/POP2020)*100000)

Results

Fatalities Overtime (2019-2023)

As discussed in the methodology section above, the Individual Information dataset and the Case Information dataset both report fatalities over time, but the Case Information dataset only reports fatal collisions involving pedestrians generally and there is no way to determine whether the person who died was the motorist or the pedestrian.

Since both datasets contain valuable variables that are used in subsequent analyses, it is important to compare the results for reported fatalities over time. For the period 2019-2023, the Individual Information dataset reports 1,375 pedestrian fatalities, whereas the Case Information dataset reports 1,301 fatal accidents involving pedestrians. Although different, the graphs display similar trends. Both datasets indicate a small net increase in fatalities from 2019 to 2023. Each dataset also shows a noteworthy decrease in pedestrian fatalities in 2020 which can be explained by lockdowns and travel restrictions that kept everyone home during the height of the COVID-19 pandemic.

Show the code
#Plot 1: Fatalities over time (Individual Information vs. Case Information)
YearCount <- IndividualInfo%>%
group_by(year)%>%
summarize(n = n()) #1375

YearCountCase <- CaseInfo%>%
group_by(year)%>%
summarize(n = n()) #1301

YearCountJoin <- YearCount %>%
 left_join(YearCountCase, by = c("year"))

plot_ly(data = YearCountJoin, x=~year, y=~n.x, name = "Individual Information", type = "scatter", mode = "lines",line = list(width=3, color="darkred"))%>%
  add_trace(y= ~n.y, name = "Case Information", line = list(width=3, color="red"))%>%
  layout(title = list(text = "Individual Information vs. Case Information", font = list(color = "black"), y = .98, x = 0.5), legend = list(orientation = 'h', xanchor = 'center', yanchor = 'middle', x = 0.5, y = -0.3), yaxis = list(title='Fatalities', showgrid = F, showline = T, y = 5), xaxis = list(title = "Year", showgrid = F, showline = T))

Age

According to the U.S. DOT, the average age of pedestrians killed in motor vehicle accidents was 47 in 2021 (NHTSA, 2023). From 2019 to 2023, the Individual Information dataset reported a slightly higher average with pedestrians killed in motor vehicle crashes in NYS having an average age of 52. The distribution appears to be normal and does not yield any surprising results or age groups that appear to be overrepresented.

Show the code
#Plot 2: Individual Information Age
Age <-IndividualInfo$age 
Mean<- mean(IndividualInfo$age, na.rm=TRUE)

plot_ly(x=~Age, type = "histogram", nbinsx = 12, marker = list(color = "darkred"))%>%
  layout(title = list(text = "Histogram of Age", y = .98, x = 0.5, font = list(color = "black")), yaxis = list(title='Fatalities'))

Day of Week

The Case Information dataset reports the day of the week of each fatal accident involving a pedestrian. Out of 1,301 fatal accidents, the most fatal accidents happened on Saturdays (n = 209), and the least happened on Sundays (n = 157). It is possible that these results reflect the driving behaviors of working NYS residents who have Saturdays and Sundays off and may be more likely to engage in late-night or even reckless activities in the evenings when they don’t have to wake up early for work the next morning.

Show the code
#Plot 3: Case Information Day of Week
DayofWeek <- CaseInfo %>%
mutate(day_of_week = factor(day_of_week, levels = c("Monday","Tuesday","Wednesday","Thursday","Friday","Saturday","Sunday"),ordered = TRUE))%>%
  group_by(day_of_week)%>%
  count(day_of_week)
names(DayofWeek)[2] <- "Count"

plot_ly(data= DayofWeek, x=~day_of_week, y=~Count, type ="bar", marker = list(color = "darkred"))%>%
  layout(title = list(text = "Pedestrian Fatalities by Day of Week", y = .98, x = 0.5, font = list(color = "black")), yaxis = list(title='Fatalities'), xaxis= list(title=''))

Weather Conditions

Surprisingly, a majority of fatal accidents (n = 825) involving a pedestrian from 2019 to 2023 occurred during clear weather conditions. It is likely that poor weather conditions (e.g., rain and snow) deter people from walking, and under such conditions, pedestrians would rather drive or take public transportation than walk.

Show the code
#Plot 4: Case Information Weather 
Weather <- CaseInfo %>%
  group_by(weather_conditions)%>%
  count(weather_conditions)
names(Weather)[2] <- "Count"

plot_ly(data= Weather, x=~weather_conditions, y=~Count, type ="bar", marker = list(color = "darkred"))%>%
  layout(title = list(text = "Pedestrian Fatalities by Weather Conditions", y = .98, x = 0.5, font = list(color = "black")), yaxis = list(title='Fatalities'), xaxis= list(title='Weather Conditions'))

Pedestrian Action

The most common pedestrian action reported was “Crossing, No Signal or Crosswalk” (n = 481). This suggests that a lack of appropriate infrastructure creates dangerous conditions for pedestrians when crossing the road.

Show the code
#Plot 5: Case Information Pedestrian action
Action <- CaseInfo %>%
  group_by(pedestrian_bicyclist_action)%>%
  count(pedestrian_bicyclist_action)%>%
  arrange(desc(n))%>%
  head(5)

names(Action)[1] <- "Pedestrian Action"
names(Action)[2] <- "Count"

kable(Action)%>%
  kable_styling(full_width = F) %>%
  row_spec(0, color = "white", background = "darkred") 
Pedestrian Action Count
Crossing, No Signal or Crosswalk 481
Other Actions in Roadway 179
Crossing, Against Signal 134
Crossing, With Signal 123
Riding/Walking/Skating Along Highway With Traffic 95

Fatalities by County

The interactive map below shows the geospatial distribution of fatal motor vehicle accidents involving pedestrians in NYS. Even when controlling for population size, counties containing big cities feature more fatal accidents per 100,000 people than counties that don’t. One explanation for this trend is that these counties have higher population and urban development densities that are more appealing to pedestrians.

Show the code
#Plot 6: Fatalities per 100,000 people
#This website was helpful https://r-graph-gallery.com/183-choropleth-map-with-leaflet.html
#https://rstudio.github.io/leaflet/articles/legends.html
Transform <- st_transform(MapJoin3, crs = '+proj=longlat +datum=WGS84')
Transform2<- st_transform(MapJoin2, crs = '+proj=longlat +datum=WGS84')

text <- paste(
  "County: ", Transform$NAME, "<br/>", "Fatalities: ", round(Transform$DeathsPer1k, 2), sep = "") %>%
  lapply(htmltools::HTML)

pal <- colorNumeric(palette = "Reds", domain = Transform$DeathsPer1k)
  
leaflet(Transform)%>%
  addProviderTiles(providers$CartoDB.Positron)%>%
  setView(lng = -76.12524729, lat = 43.24235909, zoom = 5)%>%
  addPolygons(stroke = FALSE, , fillOpacity = 0.5,
  smoothFactor = 0.5, color = ~ colorQuantile("Reds", Count)(Count), label = text)%>%
  addLegend("bottomright", pal = pal, values = ~DeathsPer1k, title = "Fatalities per 100,000 people (2019-2023)", opacity = 1)

Fatalities by Reference Marker

Out of 1,301 reported collisions with pedestrians only 550 cases report a reference marker that can be used to pinpoint the approximate location of an incident. NYSDOT reference markers only exist on roads that are under the jurisdiction of the state. Therefore, many cases probably do not report a reference marker because they happened on a county or municipal-managed road. Roads that are under the control of the state are likely wider and may have higher speed limits compared to county and municipal roads. Take, for example, Buffalo, NY, where several accidents appear along Bailey Avenue and Broadway Street. Although the speed limit at this intersection is 30 mph street widths exceeding 70 feet in each direction likely encourage driving at higher speeds.

Show the code
#Plot7: Reference Marker
leaflet(Transform2)%>%
  addProviderTiles(providers$CartoDB.Positron)%>%
  setView(lng = -76.12524729, lat = 43.24235909, zoom = 5)%>%
  addCircleMarkers(color= "darkred", radius = 4, stroke = FALSE, fillOpacity = 0.5)

Conclusions

It is difficult to come to any concrete conclusions given the limitations of this research and the datasets that were explored. Variables such as age, day of the week, and poor weather conditions do not seem to play a major role in contributing to fatal crashes involving pedestrians. However, the analysis of pedestrian actions from 2019 to 2023 suggests that crashes are most common when pedestrians are crossing where there is no signal or crosswalk. Signals and crosswalks could be installed at a relatively low cost to improve pedestrian safety, especially in counties across NYS that feature more accidents per 100,000 people, such as Suffolk, Nassau, Albany, Monroe, and Onondaga County. The NYSDOT could also consider making longitudinal data available to the public so that trends can be more easily explored before 2019. Additional information could similarly be provided to improve our understanding of the disproportionate impacts of traffic violence on Black and Hispanic populations that suffer “higher traffic fatality rates per mile traveled than White Americans” (Raifman, 2022, p. 160). This information would be useful in understanding how we can advance mobility justice moving forward. Location information that is not connected to a NYS reference marker would also be helpful, as many accidents do not occur on NYS-owned and operated roads. Until more data is made available, continued investigation with these datasets could analyze gender, time of day, month of the year, lighting conditions, and the geospatial distribution of pedestrian fatalities within counties and at the municipal level.

References

NYS. (n.d.). Pedestrian safety. The State of New York. https://www.ny.gov/programs/pedestrian-safety

NYS Department of Health. (2024, June). Pedestrian safety: It’s no accident. https://trafficsafety.ny.gov/pedestrian-safety-doh#:~:text=Approximately 300 pedestrians are killed,the law to prevent injuries.

Raifman, M. A., & Choma, E. F. (2022). Disparities in Activity and Traffic Fatalities by Race/Ethnicity. American Journal of Preventive Medicine, 63(2), 160–167. https://doi.org/10.1016/j.amepre.2022.03.012