njelections

This is a data package for R containing the results of statewide elections in NJ, from 2004 to 2021.

Installation

You can install the development version of njelections like so:

# install.packages("devtools")
devtools::install_github("tor-gu/njelections")

Dataset Overview

This package contains the results of statewide general elections for three offices:

  • New Jersey Governor
  • US Senate
  • US President

All elections from 2004 to 2021 are included, at three levels of organization:

  • Statewide (election_statewide)
  • By county (election_by_county)
  • By municipality (election_by_municipality)

Table election_statewide

Table election_statewide contains the following columns, which are common to all tables in this dataset:

Field Type Description Example
year int Election year 2004
type chr Currently always ‘General’ General
office chr ‘President’, ‘Senate’, or ‘Governor’ President
candidate chr Candidate name John F. Kerry
party chr Candidate party Democratic
vote int Number of votes received 1911430

There is one row in this table for every year, office, and candidate combination.

Example
year type office candidate party vote
2004 General President John F. Kerry Democratic 1911430
2004 General President George W. Bush Republican 1670003
2004 General President Ralph Nader Independent 19418
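
As a small illustration of working with this table's shape, the sketch below rebuilds the three example rows above as a plain data frame and computes each candidate's share among those rows. The name statewide_2004 and the column share_of_listed are made up for this example. The share is relative to the three listed rows only, not to the full field of candidates.

```r
# Rows copied from the example table above (only the three listed candidates)
statewide_2004 <- data.frame(
  year = 2004L,
  type = "General",
  office = "President",
  candidate = c("John F. Kerry", "George W. Bush", "Ralph Nader"),
  party = c("Democratic", "Republican", "Independent"),
  vote = c(1911430L, 1670003L, 19418L)
)

# Share of the vote among the listed rows only
statewide_2004$share_of_listed <-
  statewide_2004$vote / sum(statewide_2004$vote)

# Candidate with the largest vote count among the listed rows
leader <- statewide_2004$candidate[which.max(statewide_2004$vote)]
```

With the package loaded, the same steps should apply to election_statewide after filtering to a single year and office.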

Table election_by_county

Table election_by_county contains all of the columns in election_statewide, plus two more:

Field Type Description Example
GEOID chr US Census GEOID for the county 34001
county chr County name Atlantic County

There is one row in this table for every year, office, county and candidate combination. In particular, for a given year and office, every candidate is represented in every county.

Example
year type office GEOID county candidate party vote
2004 General President 34001 Atlantic County John F. Kerry Democratic 55746
2004 General President 34003 Bergen County John F. Kerry Democratic 207666
2004 General President 34005 Burlington County John F. Kerry Democratic 110411
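
The "every candidate in every county" property can be checked mechanically. The following is a minimal sketch in base R, run on a toy table rather than the package data. The helper is_complete_grid and the toy data frame are hypothetical names introduced here, and the toy table carries only the key columns.

```r
# Hypothetical helper: TRUE if, within each year/office, every candidate
# appears exactly once for every GEOID
is_complete_grid <- function(df) {
  groups <- split(df, list(df$year, df$office), drop = TRUE)
  all(vapply(groups, function(g) {
    n_expected <- length(unique(g$GEOID)) * length(unique(g$candidate))
    nrow(g) == n_expected && !any(duplicated(g[c("GEOID", "candidate")]))
  }, logical(1)))
}

# Toy key columns only: two counties crossed with two candidates
toy <- expand.grid(
  GEOID = c("34001", "34003"),
  candidate = c("John F. Kerry", "George W. Bush"),
  stringsAsFactors = FALSE
)
toy$year <- 2004L
toy$office <- "President"

complete_ok <- is_complete_grid(toy)            # full grid
complete_missing <- is_complete_grid(toy[-1, ]) # one row dropped
```

Assuming the column names documented above, the same helper could be applied to election_by_county directly.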

Table election_by_municipality

Table election_by_municipality contains all of the columns in election_statewide, plus three more:

Field Type Description Example
GEOID chr US Census GEOID for the municipality 3400100100
county chr County name Atlantic County
municipality chr Municipality name Absecon city

There is one row in this table for every year, office, county, municipality and candidate combination. In particular, for a given year and office, every candidate is represented in every municipality.

Example
year type office GEOID county municipality candidate party vote
2004 General President 3400100100 Atlantic County Absecon city John F. Kerry Democratic 1800
2004 General President 3400100100 Atlantic County Absecon city George W. Bush Republican 2177
2004 General President 3400100100 Atlantic County Absecon city Ralph Nader Independent 25

Notes

Data source

The source for this data is the New Jersey Division of Elections. The data was derived by scraping the PDFs in the election results archive.

NJ municipalities

New Jersey municipalities have not been stable over the course of 2004-2021:

  • Several municipalities have changed names or been assigned new GEOIDs by the US Census.
  • In 2013, Princeton borough and Princeton township merged.

The njmunicipalities package contains municipality names and GEOIDs across the period 2001-2021. The election_by_municipality table uses the names and GEOIDs from the njmunicipalities package for the year of the election, with the exception of the Princetons for the 2012 election. See the example "Accounting for changing municipal names" below for a worked example dealing with these issues.

Princeton and the 2012 election

At the time of the 2012 election, Princeton borough and Princeton township were still separate municipalities. However, the official results for Mercer County provide only the combined results for the merged Princeton municipalities.

As a result, the election_by_municipality table uses the 2013 municipality list from njmunicipalities for the 2012 election. The Princeton merger is the only difference between the 2012 and 2013 municipality lists.

Candidate and party names

In general, an attempt was made to record candidate and party names exactly as they appear in the official results. However, when the same candidate or party appears in multiple elections with slightly varying names, the most common form of the name was used.

For example, Jeff Boss has appeared in official results variously as ‘Jeff Boss’, ‘Jeffrey Boss’ and ‘Jeffery “Jeff” Boss’. In this package, his name has been standardized to “Jeff Boss”.

Similarly, the Green and Libertarian party names have been standardized to “Green Party” and “Libertarian Party”.

When a candidate does not have a listed party, the party is recorded as “Independent”.
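
The standardization rules in this section can be pictured as a lookup from variant spellings to one canonical form, plus a default for missing parties. The sketch below is an illustration only and is not the code used to build this package. The names name_variants, standardize_candidate and fill_party are hypothetical, and the variant spellings are taken from the Jeff Boss example above.

```r
# Hypothetical lookup: variant spelling -> standardized name
name_variants <- c(
  "Jeffrey Boss" = "Jeff Boss",
  "Jeffery \"Jeff\" Boss" = "Jeff Boss"
)

# Replace known variants, leave all other names untouched
standardize_candidate <- function(x) {
  hit <- x %in% names(name_variants)
  x[hit] <- unname(name_variants[x[hit]])
  x
}

# Record a missing or empty party as "Independent"
fill_party <- function(party) {
  party[is.na(party) | party == ""] <- "Independent"
  party
}

std_names <- standardize_candidate(
  c("Jeff Boss", "Jeffrey Boss", "John F. Kerry")
)
std_party <- fill_party(c("Democratic", NA, ""))
```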

Consistency across levels

State vs county

For every year, office and candidate combination, the vote total across counties exactly matches the vote total in the statewide results:

library(dplyr)
# Statewide election matches sum of county votes for every year and every office
election_by_county |>
  group_by(year, type, office, candidate) |>
  summarize(county_vote = sum(vote), .groups = "drop") |>
  left_join(election_statewide,
             by = c("year", "type", "office", "candidate")) |>
  filter(vote != county_vote) |>
  nrow()
#> [1] 0

County vs municipality

The sum across municipalities does not always match the county total. In many – but not all – cases, the official county results account for the discrepancy. For example, the official 2020 Presidential results from Morris County include federal overseas votes in a separate row, not assigned to any municipality. These discrepancies, even when explicitly included in the official results, are not recorded in this package.
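
One way to locate such discrepancies is to sum the municipal rows per county and subtract the result from the county total. The sketch below uses base R on toy data. All vote counts and place names in it are made up for illustration, and a real comparison would also key on year, type and office.

```r
# Toy county totals (made-up numbers)
county <- data.frame(
  county = c("A County", "B County"),
  candidate = "X",
  vote = c(100L, 250L)
)

# Toy municipal results (made-up numbers); B County is 5 votes short
muni <- data.frame(
  county = c("A County", "A County", "B County", "B County"),
  municipality = c("A1", "A2", "B1", "B2"),
  candidate = "X",
  vote = c(60L, 40L, 150L, 95L)
)

# Sum municipal votes per county and candidate
muni_total <- aggregate(vote ~ county + candidate, data = muni, FUN = sum)
names(muni_total)[names(muni_total) == "vote"] <- "municipal_vote"

# Votes in the county total that no municipality accounts for
cmp <- merge(county, muni_total, by = c("county", "candidate"))
cmp$unassigned <- cmp$vote - cmp$municipal_vote
```

A nonzero unassigned value flags a county whose official total includes votes not attributed to any municipality.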

Examples

Displaying in ‘wide’ format

library(dplyr)
library(tidyr)
library(njelections)
hudson_senate_2012 <- election_by_municipality |> 
  filter(year == 2012,
         office == "Senate",
         county == "Hudson County") |>
  select(GEOID, municipality, party, vote) |>
  pivot_wider(names_from = party, values_from = vote) |>
  select(GEOID, municipality, Democratic, Republican, 
         Libertarian = `Libertarian Party`, Green = `Green Party`)
GEOID municipality Democratic Republican Libertarian Green
3401703580 Bayonne city 12735 5067 98 166
3401719360 East Newark borough 356 59 2 1
3401728650 Guttenberg town 2366 500 20 18
3401730210 Harrison town 2458 520 22 31
3401732250 Hoboken city 12819 5695 210 184
3401736000 Jersey City city 56469 8077 414 565
3401736510 Kearny town 6706 2559 52 93
3401752470 North Bergen township 15187 3196 54 100
3401766570 Secaucus town 3940 1942 21 37
3401774630 Union City city 14094 2169 56 79
3401777930 Weehawken township 3429 915 41 55
3401779610 West New York town 9166 2139 26 60

Accounting for changing municipal names

Over the period 2004-2021, several municipalities changed names and GEOIDs, and Princeton township was merged into Princeton borough. The package njmunicipalities is helpful here.

As an example, let's consider Mercer County, which includes the merged Princetons, as well as Robbinsville township, previously known as Washington township. Let’s plot the two-party share of votes for each municipality in Mercer, using the current name for each municipality, and combining the totals for the Princetons in the years prior to the merger.

First, generate a cross reference table for the GEOIDs, using the 2021 GEOIDs and municipality names as the reference. We use njmunicipalities::get_geoid_cross_references and njmunicipalities::get_municipalities for this.

library(njmunicipalities)
geoid_xref <- get_geoid_cross_references(2021, 2004:2021) |>
  dplyr::filter(!is.na(GEOID_ref)) |>
  dplyr::left_join(get_municipalities(2021), by = c("GEOID_ref" = "GEOID"))
year GEOID_ref GEOID county municipality
2004 3400100100 3400100100 Atlantic County Absecon city
2004 3400102080 3400102080 Atlantic County Atlantic City city
2004 3400107810 3400107810 Atlantic County Brigantine city
2004 3400108680 3400108680 Atlantic County Buena borough
2004 3400108710 3400108710 Atlantic County Buena Vista township

Now, generate the two-party share of the vote, combining Princeton borough and township. The constants PRINCETON_TWP_GEOID and PRINCETON_BORO_GEOID come from njmunicipalities.

tpsov <- njelections::election_by_municipality |>
  dplyr::mutate(GEOID = dplyr::if_else(GEOID == PRINCETON_TWP_GEOID,
                                       PRINCETON_BORO_GEOID,
                                       GEOID)) |>
  dplyr::group_by(year, office, GEOID, party) |>
  dplyr::summarize(vote = sum(vote), .groups = "drop") |>
  dplyr::filter(party %in% c("Democratic", "Republican")) |>
  dplyr::group_by(year, office, GEOID) |>
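  # mutate() keeps one row per party; summarize() returning several rows
  # per group is deprecated as of dplyr 1.1.0
  dplyr::mutate(two_party_share_of_vote = vote / sum(vote)) |>
  dplyr::ungroup() |>
  dplyr::select(-vote)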
year office GEOID party two_party_share_of_vote
2004 President 3400100100 Democratic 0.4526025
2004 President 3400100100 Republican 0.5473975
2004 President 3400102080 Democratic 0.7595311
2004 President 3400102080 Republican 0.2404689
2004 President 3400107810 Democratic 0.4536190
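
As a quick check of the arithmetic, the two-party share for Absecon city (GEOID 3400100100) can be recomputed by hand from the 2004 Presidential votes shown earlier in the election_by_municipality example (1800 Democratic, 2177 Republican). The variable names below are chosen for this sketch only.

```r
# 2004 Presidential votes for Absecon city, from the example table above
absecon_2004 <- c(Democratic = 1800, Republican = 2177)

# Each party's vote divided by the combined two-party vote
two_party_share <- absecon_2004 / sum(absecon_2004)

# Round to seven places for comparison with the printed table
rounded_share <- round(two_party_share, 7)
```

Rounded to seven places, this gives 0.4526025 and 0.5473975, which agrees with the first two rows of the output above.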

Finally, combine the two tables and plot.

library(ggplot2)
tpsov |>
  dplyr::left_join(geoid_xref, by = c("year", "GEOID")) |>
  dplyr::filter(county == "Mercer County") |>
  ggplot(aes(x = year, y = two_party_share_of_vote, color = party)) +
  scale_color_manual(values = c("Democratic" = "blue", "Republican" = "red")) +
  geom_point() + 
  geom_smooth(se = FALSE, formula = y ~ x, method = "loess") + 
  facet_wrap("municipality") +
  ylab("Two party share of vote") +
  xlab("Election year") +
  labs(title = "Mercer County, NJ, two party share of vote",
       subtitle = "US Senate, President and Governor races, 2004-2021")