This project applies text analysis techniques, including sentiment analysis and topic modeling, to State of the Union addresses. Because existing work already covers these analyses, the project extends them by testing whether the sentiment of State of the Union addresses can be used to predict presidential approval.
UCSB’s The American Presidency Project
- Corpus built by scraping State of the Union transcripts
- Output stored as one .txt file per address, covering both written and spoken addresses
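Since each address is stored as its own .txt file, the corpus can be read back into memory with a few lines of Python. The following is a minimal sketch rather than the repository's actual loading code: the function name `load_corpus`, the directory name in the usage note, and the UTF-8 encoding are all assumptions.

```python
from pathlib import Path


def load_corpus(corpus_dir):
    """Read every .txt file in corpus_dir into a {file stem: text} dict.

    Hypothetical helper: the directory layout and UTF-8 encoding are assumed.
    """
    corpus = {}
    for path in sorted(Path(corpus_dir).glob("*.txt")):
        # Key each speech by its file name without the .txt extension.
        corpus[path.stem] = path.read_text(encoding="utf-8")
    return corpus
```

For example, `speeches = load_corpus("speeches")` would return one entry per address, assuming the scraper wrote its output to a folder named `speeches`.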
Roper Center for Public Opinion Research Presidential Approval Project
- Database of public presidential polls starting from 1942
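Individual polls have to be aggregated before they can be compared against a yearly address. One simple way to do this, sketched below, is to average the approval percentage per calendar year. This is an illustration under assumptions, not the method used in `approval.py`: the function name `yearly_approval` and the `(poll_date, approve_pct)` record layout are hypothetical, and the sample values are made up.

```python
from collections import defaultdict
from datetime import date
from statistics import mean


def yearly_approval(polls):
    """Average approval percentage per calendar year.

    polls: iterable of (poll_date, approve_pct) pairs (assumed layout).
    """
    by_year = defaultdict(list)
    for poll_date, approve_pct in polls:
        by_year[poll_date.year].append(approve_pct)
    # Return the yearly means in chronological order.
    return {year: mean(values) for year, values in sorted(by_year.items())}


# Made-up values for illustration only, not real poll results.
example_polls = [
    (date(2000, 1, 10), 50.0),
    (date(2000, 6, 1), 60.0),
    (date(2001, 3, 5), 40.0),
]
print(yearly_approval(example_polls))
```

A plain yearly mean weights every poll equally; a window around each address date or a weighting by sample size may be a better fit, depending on the modeling goal.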
├── scripts
│ ├── approval.py # Code for aggregating historical polls
│ ├── evaluate.py # Helper functions to plot confusion matrix
│ ├── feature_gen.py # Scripts to generate features for predicting approval
│ ├── sentiment_analyzer.py # Analyze sentiment of speeches
│ └── sotu_scraper.py # Web scraping to obtain speech corpus
│
├── sentiment_analysis.ipynb # Notebook for analyzing SoTU sentiment scores
│
├── approval_analysis.ipynb # Notebook for approval data preprocessing and feature generation
│
├── models.ipynb # Notebook for model fitting, selection, and evaluation
│
├── bigram_analysis.ipynb # Notebook for text bigram analysis
│
├── topic_modeling.ipynb # Notebook for text topic modeling using LDA, making word clouds
│
├── tmpp-presentation.pdf # Project presentation (5/29/18)
│
└── README.md
- Pablo Martinez Monsivais/Getty Images, State of the Union Photo Gallery
- Jonathan Bouchet, NLP analysis on the SOTU addresses
- Jennifer Dixon, Presidential Speech Analysis
- Frank Evan, Topic Modeling of the State of the Union Address
- FiveThirtyEight, Presidential approval poll aggregator
- UCSB, The American Presidency Project
- Roper Center for Public Opinion Research, Presidential Approval Project