Comparing Unemployment Rates Across U.S. States During Economic Recessions

Author

Swetha

Published

November 14, 2024

Introduction

This project analyzes unemployment rate fluctuations across U.S. states during two major economic downturns: the 2008 financial crisis and the COVID-19 pandemic. By leveraging spatial and temporal data, we map and compare unemployment rate changes before, during, and after these recessions. The insights from this analysis may highlight patterns in economic impact and recovery across states, informing potential strategies for future recession management.

Materials and Methods

Data Sources

  1. U.S. Bureau of Labor Statistics (BLS): Monthly state-level unemployment data.
  2. U.S. Census Bureau: State population estimates.

Data Cleaning and Preparation

The following code loads, cleans, and aggregates the data.

Load libraries

if (!requireNamespace("gganimate", quietly = TRUE)) install.packages("gganimate")
if (!requireNamespace("dplyr", quietly = TRUE)) install.packages("dplyr")
if (!requireNamespace("sf", quietly = TRUE)) install.packages("sf")
if (!requireNamespace("maps", quietly = TRUE)) install.packages("maps")
if (!requireNamespace("ggplot2", quietly = TRUE)) install.packages("ggplot2")
if (!requireNamespace("tidyverse", quietly = TRUE)) install.packages("tidyverse")
if (!requireNamespace("gifski", quietly = TRUE)) install.packages("gifski", type = "binary")
library(dplyr)

library(ggplot2)
library(gganimate)
library(sf)
library(maps)
library(tidyverse)
library(gifski)

Load the monthly dataset

data <- read.csv("data/States_Unemploment_Data.csv")

Data Cleaning Steps

# 1. Drop duplicate rows
data <- data %>% distinct()

# 2. Replace placeholder values ("NA", "unknown") with NA in character columns.
#    This runs before drop_na() so that rows holding placeholders are removed too.
data <- data %>%
  mutate(across(where(is.character), ~ replace(.x, .x %in% c("NA", "unknown"), NA)))

# 3. Remove rows with any missing values
data <- data %>% drop_na()

# 4. Keep only rows whose `Unemployment_rate` is within a realistic range (0 - 100)
data <- data %>%
  filter(Unemployment_rate >= 0 & Unemployment_rate <= 100)

# Check for remaining missing values after cleaning
summary(data)
    State                Year         Period         
 Length:7344        Min.   :2006   Length:7344       
 Class :character   1st Qu.:2009   Class :character  
 Mode  :character   Median :2012   Mode  :character  
                    Mean   :2014                     
                    3rd Qu.:2019                     
                    Max.   :2022                     
 Labor_Force_Participation_Ratio Employment_Participation_Ratio
 Min.   :52.60                   Min.   :41.70                 
 1st Qu.:61.80                   1st Qu.:57.90                 
 Median :64.70                   Median :60.80                 
 Mean   :64.79                   Mean   :61.03                 
 3rd Qu.:68.10                   3rd Qu.:64.50                 
 Max.   :74.70                   Max.   :72.40                 
  Labor_Force         Employment        Unemployment     Unemployment_rate
 Min.   :  281549   Min.   :  272394   Min.   :   6835   Min.   : 1.900   
 1st Qu.:  779246   1st Qu.:  745945   1st Qu.:  40342   1st Qu.: 3.800   
 Median : 2056275   Median : 1920110   Median : 105112   Median : 5.100   
 Mean   : 3084129   Mean   : 2890735   Mean   : 193394   Mean   : 5.825   
 3rd Qu.: 3733868   3rd Qu.: 3477744   3rd Qu.: 234754   3rd Qu.: 7.600   
 Max.   :19592298   Max.   :18732265   Max.   :3015308   Max.   :30.600   
    Latitude       Longitude      
 Min.   :19.90   Min.   :-155.67  
 1st Qu.:35.52   1st Qu.:-105.03  
 Median :39.55   Median : -89.40  
 Mean   :39.56   Mean   : -93.34  
 3rd Qu.:43.30   3rd Qu.: -78.66  
 Max.   :63.59   Max.   : -69.45  
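`summary()` does not report `NA` counts for character columns such as `State` and `Period`, so an explicit per-column count can complement it. The following is a minimal sketch on a made-up data frame; `toy` and its values are assumptions for illustration and are not taken from the BLS file.

```r
# Toy data standing in for the real file; values are invented for illustration.
toy <- data.frame(
  State = c("Alabama", "unknown", "Alaska", NA),
  Unemployment_rate = c(4.1, 5.3, NA, 6.0),
  stringsAsFactors = FALSE
)

# Treat the placeholder string "unknown" as missing in character columns.
is_chr <- vapply(toy, is.character, logical(1))
toy[is_chr] <- lapply(toy[is_chr], function(x) replace(x, x %in% "unknown", NA))

# Count missing values per column and the number of fully complete rows.
na_counts <- colSums(is.na(toy))
n_complete <- sum(complete.cases(toy))

print(na_counts)
print(n_complete)
```

Running the same two counts on the cleaned data would confirm whether any missing values remain after the cleaning steps.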

Aggregate data by State and Year to calculate the annual averages

annual_data <- data %>%
  group_by(State, Year) %>%
  summarise(
    Labor_Force_Participation_Ratio = mean(Labor_Force_Participation_Ratio, na.rm = TRUE),
    Employment_Participation_Ratio = mean(Employment_Participation_Ratio, na.rm = TRUE),
    Labor_Force = mean(Labor_Force, na.rm = TRUE),
    Employment = mean(Employment, na.rm = TRUE),
    Unemployment = mean(Unemployment, na.rm = TRUE),
    Unemployment_rate = mean(Unemployment_rate, na.rm = TRUE),
    Latitude = first(Latitude),
    Longitude = first(Longitude),
    .groups = "drop"  # return an ungrouped tibble
  )
# Display the first few rows of the aggregated data
head(annual_data)
# A tibble: 6 × 10
  State    Year Labor_Force_Participation_R…¹ Employment_Participa…² Labor_Force
  <chr>   <int>                         <dbl>                  <dbl>       <dbl>
1 Alabama  2006                          61.6                   59.1    2169933.
2 Alabama  2007                          61.3                   58.8    2180868.
3 Alabama  2008                          60.6                   57.0    2177201.
4 Alabama  2009                          59.6                   53.6    2157016.
5 Alabama  2010                          59.5                   53.3    2196694.
6 Alabama  2011                          59.1                   53.5    2201954.
# ℹ abbreviated names: ¹​Labor_Force_Participation_Ratio,
#   ²​Employment_Participation_Ratio
# ℹ 5 more variables: Employment <dbl>, Unemployment <dbl>,
#   Unemployment_rate <dbl>, Latitude <dbl>, Longitude <dbl>
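To compare the two downturns on annual data, each year can be tagged with a period label and the rates averaged within each label. The sketch below is one possible way to do this; the helper `label_period()`, the year windows it uses, and the values in `toy_annual` are all assumptions made for illustration, not part of the original analysis.

```r
library(dplyr)

# Hypothetical helper: maps a year to a coarse period label.
# The year windows are an assumption chosen for illustration only.
label_period <- function(year) {
  case_when(
    year %in% 2008:2009 ~ "2008 financial crisis",
    year == 2020        ~ "COVID-19 pandemic",
    TRUE                ~ "Other years"
  )
}

# Toy annual data; the rates are invented, not BLS figures.
toy_annual <- data.frame(
  State = rep(c("A", "B"), each = 4),
  Year = rep(c(2007L, 2008L, 2009L, 2020L), times = 2),
  Unemployment_rate = c(4, 6, 10, 8, 5, 7, 9, 12)
)

# Average unemployment rate per state within each period label.
period_summary <- toy_annual %>%
  mutate(Period_label = label_period(Year)) %>%
  group_by(State, Period_label) %>%
  summarise(mean_rate = mean(Unemployment_rate), .groups = "drop")

print(period_summary)
```

Applied to `annual_data`, the same pattern would give one mean rate per state and period, which makes the two recessions directly comparable.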

Visualization Techniques

  1. State-Level Map Visualization
  2. Time-Series Trend Charts
  3. Bar Chart Race
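As an illustration of the second technique, a time-series trend chart can be drawn with one line per state. This is only a sketch on invented values; `toy_trend` is a hypothetical stand-in for the aggregated annual data, and the styling choices are assumptions.

```r
library(ggplot2)

# Toy annual rates for two placeholder states; values are invented.
toy_trend <- data.frame(
  State = rep(c("State A", "State B"), each = 5),
  Year = rep(2006:2010, times = 2),
  Unemployment_rate = c(4.0, 4.2, 5.8, 9.5, 9.0, 5.1, 5.0, 6.4, 8.1, 7.7)
)

# One line per state, with points marking each annual value.
trend_plot <- ggplot(toy_trend, aes(x = Year, y = Unemployment_rate, color = State)) +
  geom_line() +
  geom_point() +
  labs(
    title = "Unemployment Rate Over Time",
    x = "Year",
    y = "Unemployment Rate (%)"
  ) +
  theme_minimal()

print(trend_plot)
```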

Results

Map Visualization

The following animated map shows the unemployment rate across states over time. The color scale represents unemployment rate levels.

us_map <- map_data("state")
annual_data$region <- tolower(annual_data$State)

# Merge map polygons with the annual data. Each state has many polygon vertices
# and many years, so a many-to-many match is expected here.
map_data <- inner_join(us_map, annual_data, by = "region",
                       relationship = "many-to-many")
# Animated Map Code
ggplot(map_data, aes(long, lat, group = group, fill = Unemployment_rate)) +
  geom_polygon(color = "white") +
  scale_fill_gradientn(colors = c("#2D004B", "#4144B3", "#1EA8C1", "#A8E200", "#FFDD00", "#FF6600"),
                       values = scales::rescale(c(2, 4, 6, 8, 10, 16)),
                       name = "Unemployment Rate (%)",
                       limits = c(2, 16)) +
  labs(title = "Unemployment Rate by State", subtitle = "Year: {frame_time}") +
  coord_fixed(1.3) +
  theme_minimal() +
  theme(
    plot.title = element_text(size = 16, face = "bold"),
    plot.subtitle = element_text(size = 14)
  ) +
  transition_time(Year) +
  ease_aes('linear')
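Because `inner_join()` drops rows without a match, and the `"state"` map from the maps package covers only the contiguous states and the District of Columbia, it can be worth checking which state names would be lost before plotting. A minimal sketch, using short illustrative name vectors (`map_regions` and `data_states` are assumptions, not the real column contents):

```r
# Region names as they might appear in the polygon data (illustrative subset).
map_regions <- c("alabama", "arizona", "new york")

# State names as they might appear in the unemployment data (illustrative).
data_states <- c("Alabama", "Alaska", "Hawaii", "New York")

# States that would be dropped by an inner join on lower-cased names.
unmatched <- setdiff(tolower(data_states), map_regions)
print(unmatched)
```

With the real objects, the equivalent check would compare `unique(annual_data$region)` against `unique(us_map$region)`.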

Bar Chart Race of Top 10 States by Unemployment Rate

A bar chart race visualizes the top 10 states with the highest unemployment rate each year.

top_states <- annual_data %>%
  group_by(Year) %>%
  slice_max(Unemployment_rate, n = 10) %>%
  arrange(Year, desc(Unemployment_rate))

# Bar Chart Race Code
ggplot(top_states, aes(x = reorder(State, -Unemployment_rate), y = Unemployment_rate, fill = State)) +
  geom_col() +
  coord_flip() +
  labs(
    title = "Top 10 States by Unemployment Rate",
    subtitle = "Year: {frame_time}",
    x = "State",
    y = "Unemployment Rate (%)"
  ) +
  theme_minimal() +
  transition_time(Year) +
  ease_aes('linear')
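In a bar chart race the bar order usually changes from frame to frame, which requires a rank computed within each year rather than a single ordering across all years. The sketch below shows one way to compute such a rank with dplyr; `toy_rates` and its values are invented for illustration.

```r
library(dplyr)

# Toy annual rates for three placeholder states; values are invented.
toy_rates <- data.frame(
  State = rep(c("A", "B", "C"), times = 2),
  Year = rep(c(2008L, 2009L), each = 3),
  Unemployment_rate = c(5.0, 7.0, 6.0, 9.0, 8.0, 10.0)
)

# Rank states within each year, with rank 1 for the highest rate.
ranked <- toy_rates %>%
  group_by(Year) %>%
  mutate(rank = min_rank(desc(Unemployment_rate))) %>%
  ungroup() %>%
  arrange(Year, rank)

print(ranked)
```

Mapping such a `rank` column to the position axis, in place of `reorder(State, -Unemployment_rate)`, is one option for letting the bars swap places as the years advance.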

Conclusion

The graphical analysis of U.S. unemployment rates during the 2008 financial crisis and the COVID-19 pandemic highlights significant trends and disparities in economic resilience across states.

  1. From the animated map visualization, we observe that states with diverse, service-based economies (such as California and New York) experienced sharp rises in unemployment during both crises. However, these states tended to recover faster, which may reflect a shift toward remote work and flexible job structures, especially during COVID-19.

  2. States heavily reliant on specific sectors like tourism and manufacturing (e.g., Nevada and Michigan) faced prolonged high unemployment rates. This suggests that sector-specific economic dependencies can heighten vulnerability during global economic disruptions.

  3. The line plot of unemployment trends over time reveals that while the 2008 financial crisis caused a relatively uniform spike in unemployment across states, the COVID-19 pandemic led to more varied impacts, with certain states being hit much harder than others. This difference may be attributed to varying public health restrictions, digital adaptability, and government support measures across states.

  4. The bar chart race visualization, showing the top 10 states with the highest unemployment rates over time, highlights that certain states consistently appeared in the high-unemployment category across crises. This indicates persistent structural challenges that may require targeted policy interventions.