Silkworm - US presidential election result predictor

Introduction

Silkworm forecasts who will win the US presidential election using two sets of input data:

State polls
Previous election results

In 2020, my model correctly forecast 49 of the 51 contests (50 states plus DC). The two it called wrongly were Florida and North Carolina, both of which were fairly big misses. Overall, opinion polls took a beating in 2020, and the discussion about what went wrong has already begun (see, for example, https://www.pewresearch.org/fact-tank/2020/11/13/understanding-how-2020s-election-polls-performed-and-what-it-might-mean-for-other-kinds-of-survey-work/). Until the accuracy of opinion polling improves, the kind of poll aggregation work I'm doing with Silkworm should probably sit on hold.

You can see a video of my (pre-election) PyData Boston presentation on Silkworm here: https://www.youtube.com/watch?v=5Pnr0wbuUzM&t=23s

I've blogged a lot on forecasting US Presidential elections. Here are some of my blog posts:

How it works

Silkworm aggregates state polls using a rolling 7-day window: it orders the polls in the window by candidate spread and takes the median poll. The prior election result is used to 'seed' the analysis in all cases, and where no polls exist for a state, the prior election result is used on its own. From the polling (or election) data, Silkworm calculates a daily win probability for each candidate in each state.
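The aggregation step can be sketched roughly as follows. This is a minimal illustration, not Silkworm's actual code: the function names, the `(end_date, spread)` tuple layout, the poll numbers, and the normal-distribution conversion with `sigma=3.0` are all assumptions made for the example, and the way Silkworm 'seeds' with the prior result is reduced here to a simple fallback.

```python
from datetime import date, timedelta
from math import erf, sqrt
from statistics import median


def median_spread(polls, day, prior_spread, window_days=7):
    """Median candidate spread of polls ending within the rolling window.

    polls: list of (end_date, spread) tuples, spread in percentage points
           (hypothetical layout, positive means candidate A leads).
    Falls back to the prior election spread when the window holds no polls.
    """
    start = day - timedelta(days=window_days - 1)
    spreads = [spread for end_date, spread in polls if start <= end_date <= day]
    return median(spreads) if spreads else prior_spread


def win_probability(spread, sigma=3.0):
    """Convert a spread to a win probability with a normal CDF.

    sigma is an assumed uncertainty in percentage points, chosen only
    for illustration.
    """
    return 0.5 * (1.0 + erf(spread / (sigma * sqrt(2.0))))


# Made-up polls for one state; the September poll falls outside the window.
polls = [
    (date(2020, 9, 1), 10.0),
    (date(2020, 10, 1), 2.0),
    (date(2020, 10, 3), -1.0),
    (date(2020, 10, 5), 4.0),
]
spread_today = median_spread(polls, date(2020, 10, 6), prior_spread=-3.0)
probability_today = win_probability(spread_today)
```

Using the median rather than the mean keeps a single outlying poll from dominating the daily estimate.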

The software combines the per-state candidate win probabilities using a generator polynomial. This gives a daily electoral college vote forecast.
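To illustrate the idea, here is a minimal sketch of a generating-polynomial style calculation, assuming each state is independent and contributes a factor (1 - p) + p·x^EV, where p is the candidate's win probability and EV is the state's electoral votes. The coefficient of x^k in the product is then the probability of winning exactly k electoral votes. The function names and the toy inputs are hypothetical and are not taken from the Silkworm code.

```python
def electoral_vote_distribution(states):
    """Multiply out the per-state polynomials (1 - p) + p * x**votes.

    states: list of (electoral_votes, win_probability) pairs.
    Returns a list where index k holds P(candidate wins exactly k votes).
    """
    dist = [1.0]  # the polynomial "1": zero votes with certainty
    for votes, p in states:
        new = [0.0] * (len(dist) + votes)
        for k, mass in enumerate(dist):
            new[k] += mass * (1.0 - p)   # candidate loses this state
            new[k + votes] += mass * p   # candidate wins this state
        dist = new
    return dist


def probability_at_least(dist, threshold):
    """Probability of winning at least `threshold` electoral votes."""
    return sum(dist[threshold:])


# Toy example with two made-up states worth 3 and 4 electoral votes.
toy_dist = electoral_vote_distribution([(3, 0.5), (4, 0.5)])
toy_majority = probability_at_least(toy_dist, 4)
```

With all 51 contests as input, `probability_at_least(dist, 270)` would give the chance of reaching an electoral college majority under these assumptions.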

The software is based on the work of Sam Wang of the Princeton Election Consortium (http://election.princeton.edu/). I took his MATLAB code and made an extensive series of modifications. The code has diverged so much that the two are essentially different projects at this stage.

Where the data comes from

Election result data comes from Wikipedia.

2020 opinion poll data comes from 538: https://projects.fivethirtyeight.com/polls/president-general/

Opinion poll data prior to 2020 comes from the Huffington Post Pollster API (http://elections.huffingtonpost.com/pollster/api). This API is no longer updated.

Python libraries used

The software uses the following libraries:

Pandas - used extensively.
Bokeh - used for visualization.
Requests - used to retrieve data from the Huffington Post API.

Running the software

Install Bokeh version 2.2.1 or higher
Download the project to a folder called silkworm
From the folder above silkworm, run: bokeh serve --show silkworm
Navigate to the tab 'Run/load forecast'
Load the 2020 analysis.
Explore the data!

MikeWoodward/Silkworm

Folders and files

Latest commit

History

Repository files navigation

Silkworm - US presidential election result predictor

Introduction

How it works

Where the data comes from

Python libraries used

Running the software

About

Topics

Resources

Stars

Watchers

Forks

Languages