Worldly's goal is to help anyone understand how different parts of the world respond to international events. I scraped data from six newspapers and used NMF topic modeling to extract topics from over three years' worth of articles. Worldly itself is a D3.js tool; notebook 3 also contains seaborn graphs that show article volume over time. Feel free to use any code that you think is useful!
The newspapers include:
a. New York Times (U.S.)
b. La Republica (Peru)
c. The Guardian (U.K.)
d. Times of India (India)
e. China Today (China)
f. The Age (Australia)
This is how the notebooks are organized:
- Data Acquisition (scraping & APIs), Cleaning, Storing (MongoDB, Pickle)
- (a-f) Topic Modeling (NMF) / Exploration
- Country by country visualizations of topics over time
Some of the interesting insights I found:
- India focused on Brexit before anyone else
- Peru and the U.S. focused on the arrival of Pope Francis
- The U.K. covered Zika the most, even though it is the least at risk
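Comparisons like these come down to counting articles per country, topic, and month, and checking who covered a topic first. Below is a rough sketch of that aggregation using only the standard library. The records, country names, and topic labels are placeholders rather than real results, and the helper names `monthly_volume` and `first_coverage` are hypothetical.

```python
from collections import defaultdict
from datetime import date

# Hypothetical records: (country, topic label, publication date).
articles = [
    ("Country A", "topic_1", date(2016, 2, 3)),
    ("Country B", "topic_1", date(2016, 1, 15)),
    ("Country A", "topic_1", date(2016, 2, 20)),
    ("Country B", "topic_2", date(2016, 3, 1)),
]


def monthly_volume(records):
    """Count articles per (country, topic, 'YYYY-MM') bucket."""
    counts = defaultdict(int)
    for country, topic, published in records:
        counts[(country, topic, published.strftime("%Y-%m"))] += 1
    return dict(counts)


def first_coverage(records):
    """Return the earliest publication date per (country, topic)."""
    first = {}
    for country, topic, published in records:
        key = (country, topic)
        if key not in first or published < first[key]:
            first[key] = published
    return first


volume = monthly_volume(articles)
first = first_coverage(articles)

# Which country covered topic_1 earliest in this toy data?
earliest = min(
    (country for country, topic in first if topic == "topic_1"),
    key=lambda country: first[(country, "topic_1")],
)
print(earliest)
```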