Skip to content

ds-elections/rcv

 
 

Repository files navigation

rcv

Build Status

rcv helps you work directly with raw ballot image and cast vote record data to tabulate and visualize ranked choice election results.

Features

  • Read in ballot image and master lookup files
  • Merge these files to get a "readable" ballot layout
  • Conduct elections, and view a round-by-round table of results
  • Visualize the flow of voters with an interactive Sankey diagram
  • Compatible with dplyr/magrittr pipe syntax (%>%)

Installation

We're on CRAN now! You can install our latest release (0.2.2):

install.packages("rcv")

Or, you can install the development version of rcv here:

devtools::install_github("ds-elections/rcv")

Created by:

The style of this README is inspired by the googlesheets R package.

Basic Workflow Demo

sf_bos_ballot is included as an example raw ballot image, and sf_bos_lookup as an example raw master lookup. Both are included as .rdas, and they are in the "WinEDS" format. This data comes from the 2016 San Francisco Board of Supervisors elections (San Francisco Department of Elections).

head(sf_bos_ballot)
## # A tibble: 6 x 1
##                                              X1
##                                           <chr>
## 1 000000900000660300000010020000406001000012800
## 2 000000900000660300000010020000406002000000001
## 3 000000900000660300000010020000406003000000001
## 4 000000900000660400000010020000406001000012300
## 5 000000900000660400000010020000406002000012500
## 6 000000900000660400000010020000406003000012100
head(sf_bos_lookup)
## # A tibble: 6 x 1
##                                                                            X1
##                                                                         <chr>
## 1 Candidate 0000121SAMUEL KWONG                                      00000010
## 2 Candidate 0000131TIM E. DONNELLY                                   00000010
## 3 Candidate 0000133DEAN PRESTON                                      00000010
## 4 Candidate 0000135JOEL ENGARDIO                                     00000010
## 5 Candidate 0000140MELISSA SAN MIGUEL                                00000010
## 6 Candidate 0000144AHSHA SAFAI                                       00000010

Cleaning Data

The streamlined version of this process is done with the clean_ballot() function. b_header and l_header are logical values, based on whether the ballot and lookup file respectively have a header for the first row.

cleaned <- clean_ballot(ballot = sf_bos_ballot, b_header = T, 
                        lookup = sf_bos_lookup, l_header = T, 
                        format = "WinEDS")
knitr::kable(head(readable(cleaned)))
contest pref_voter_id 1 2 3
Board of Supervisors, District 1 000006603 SANDRA LEE FEWER NA NA
Board of Supervisors, District 1 000006604 MARJAN PHILHOUR DAVID LEE SAMUEL KWONG
Board of Supervisors, District 1 000006605 DAVID LEE RICHIE GREENBERG BRIAN J. LARKIN
Board of Supervisors, District 1 000006606 MARJAN PHILHOUR DAVID LEE SANDRA LEE FEWER
Board of Supervisors, District 1 000006607 BRIAN J. LARKIN ANDY THORNLEY JASON JUNGREIS
Board of Supervisors, District 1 000006608 MARJAN PHILHOUR NA NA

To access intermediate steps, the following process can be used.

# Import and label ballot image
a <- import_data(data = sf_bos_ballot, header = T) %>%
    label(image = "ballot", format = "WinEDS")

# Import and label master lookup
b <- import_data(data = sf_bos_lookup, header = T) %>%
    label(image = "lookup", format = "WinEDS")

# Merge these two tables
c <- characterize(ballot = a, lookup = b, format = "WinEDS")

knitr::kable(head(readable(c)))
contest pref_voter_id 1 2 3
Board of Supervisors, District 1 000006603 SANDRA LEE FEWER NA NA
Board of Supervisors, District 1 000006604 MARJAN PHILHOUR DAVID LEE SAMUEL KWONG
Board of Supervisors, District 1 000006605 DAVID LEE RICHIE GREENBERG BRIAN J. LARKIN
Board of Supervisors, District 1 000006606 MARJAN PHILHOUR DAVID LEE SANDRA LEE FEWER
Board of Supervisors, District 1 000006607 BRIAN J. LARKIN ANDY THORNLEY JASON JUNGREIS
Board of Supervisors, District 1 000006608 MARJAN PHILHOUR NA NA

The readable() function takes the clean image, which is formatted for ease in computation, and formats it to be easily read manually.

Running Elections

This is done with the rcv_tally() function. sf_bos_clean is included as an example of a pre-cleaned ballot using the functions above. We will run the District 1 election from this ballot image.

results <- rcv_tally(sf_bos_clean, "Board of Supervisors, District 1")
knitr::kable(results)
candidate round1 round2 round3 round4 round5 round6 round7 round8 round9
SANDRA LEE FEWER 12550 12689 12777 12840 13029 13093 13225 13354 14705
MARJAN PHILHOUR 11067 11135 11247 11348 11487 11680 11837 12086 13126
DAVID LEE 3396 3408 3488 3551 3622 3857 3961 4093 NA
RICHIE GREENBERG 974 984 1042 1220 1272 1386 1508 NA NA
BRIAN J. LARKIN 747 773 832 896 956 997 NA NA NA
SAMUEL KWONG 740 744 760 785 814 NA NA NA NA
JONATHAN LYENS 609 652 679 726 NA NA NA NA NA
JASON JUNGREIS 611 626 654 NA NA NA NA NA NA
SHERMAN R. D'SILVA 557 566 NA NA NA NA NA NA NA
ANDY THORNLEY 359 NA NA NA NA NA NA NA NA
NA 3499 3532 3630 3743 3929 4096 4578 5576 7278

Sandra Lee Fewer wins in Round 9, with 14,705 votes to Marjan Philhour's 13,126. 3,499 ballots were left blank for this election, and 3,779 were exhausted or marked invalid in this election.

Visualizing Data

We have two recommended methods of visualizing RCV data. Both utilize a flowchart called a "Sankey diagram" to show the transfer of voters between rounds. We will use each method to visualize the transfer of voters in the San Francisco District 7 Board of Supervisors election, because District 1 has too many crossings to be readable.

Method 1 (preferred because it is interactive, quicker, and more readable) uses the networkD3 package:

d3_7 <- rcv::make_d3list(results = sf_7_results)
networkD3::sankeyNetwork(Links = d3_7$values, Nodes = d3_7$names,
                         Source = "source", Target = "target",
                         Value = "value", NodeID = "candidate", units = "voters",
                         fontSize = 12, nodeWidth = 20)

Method 2 uses the alluvial package (this type of graphic is also called an alluvial diagram):

alluvial_7 <- rcv::make_alluvialdf(image = sf_bos_clean,
                                   rcvcontest = "Board of Supervisors, District 7")
alluvial::alluvial(
  alluvial_7[,1:4], freq = alluvial_7$frequency,
  col = ifelse(alluvial_7$round4 == "NORMAN YEE", "lightgreen", "gray"),
  border = "gray", alpha = 0.7, blocks = TRUE
)

About

No description, website, or topics provided.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • R 100.0%