Campaign_Classifier

This project uses a supervised machine learning model to classify Federal Election Commission (FEC) data on campaign spending into one of nine categories, including media, digital, polling, legal, field, consulting, fundraising, and administrative. The model was written by James Scharf and Conner Delahanty.

A major challenge in working with FEC data is that campaigns and committees use a variety of terms to describe expenditures that fall within the same category of spending. Our model addresses this issue in two ways. First, we select an initial set of keywords based on our definition of each category and add terms that appear frequently in our training and testing data. Second, we use the Datamuse API, a word-finding query engine, to identify synonyms for our keywords in each category.
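The keyword expansion step described above might look roughly like the following sketch. It assumes the public Datamuse `/words` endpoint with its `rel_syn` (synonym) and `max` parameters; the function names, the seed keyword dictionary, and the choice of `rel_syn` over other Datamuse relations are illustrative assumptions, not taken from the repository's script.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

DATAMUSE_URL = "https://api.datamuse.com/words"


def build_synonym_url(keyword, max_results=10):
    """Build a Datamuse query URL asking for synonyms of one keyword."""
    return DATAMUSE_URL + "?" + urlencode({"rel_syn": keyword, "max": max_results})


def parse_words(payload):
    """Extract the 'word' field from a Datamuse JSON response body."""
    return [entry["word"] for entry in json.loads(payload)]


def fetch_synonyms(keyword, max_results=10):
    """Query Datamuse over the network and return a list of synonyms."""
    with urlopen(build_synonym_url(keyword, max_results), timeout=10) as resp:
        return parse_words(resp.read().decode("utf-8"))


def expand_keywords(seed_keywords, fetch=fetch_synonyms):
    """Add synonyms to each category's seed keywords.

    seed_keywords maps a category name to a list of keywords (hypothetical
    layout). `fetch` is injectable so the logic can be run offline.
    """
    expanded = {}
    for category, words in seed_keywords.items():
        terms = set(words)
        for word in words:
            terms.update(fetch(word))
        expanded[category] = sorted(terms)
    return expanded
```

Calling `expand_keywords({"polling": ["poll", "survey"]})` would query Datamuse once per keyword, so a real run needs network access; results should be reviewed by hand, since synonyms of a keyword do not always belong to the same spending category.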

We train and test our model using the scikit-learn SGDClassifier. The classifier requires Python 3 and the following packages: scikit-learn, NLTK, NumPy, and pandas.
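As a rough illustration of training a text classifier with SGDClassifier, the sketch below fits a small pipeline on a few expenditure descriptions. The sample descriptions and labels are invented for illustration, and the TF-IDF feature step and default hyperparameters are assumptions; the repository's script may prepare features differently.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import SGDClassifier
from sklearn.pipeline import make_pipeline

# Hypothetical expenditure descriptions and category labels (not real FEC rows).
train_texts = [
    "tv ad buy",
    "television advertising",
    "online advertising",
    "facebook ads",
    "polling services",
    "survey research",
    "legal fees",
    "legal services",
]
train_labels = [
    "media",
    "media",
    "digital",
    "digital",
    "polling",
    "polling",
    "legal",
    "legal",
]

# TF-IDF turns each description into a sparse vector; SGDClassifier then fits
# a linear model with stochastic gradient descent (a linear SVM by default).
model = make_pipeline(
    TfidfVectorizer(lowercase=True),
    SGDClassifier(random_state=0),
)
model.fit(train_texts, train_labels)

# Classify previously unseen descriptions.
predictions = model.predict(["digital ads", "attorney legal fees"])
```

With real data, the training and testing splits would be loaded from the provided TSV files (for example with pandas) rather than written inline.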

For additional information on the model and its application, see Sheingate, Adam; Scharf, James; Delahanty, Conner (2022): Digital Advertising in U.S. Federal Elections, 2004-2020. Preprint. https://doi.org/10.6084/m9.figshare.19372421.v2
