
ruettet/twittercorp


twittercorp

Make a regionally aware Twitter corpus.

The Python script does not rely on the geocoding of individual tweets, but on the self-reported location of the Twitter user.

How to proceed:

  1. Make sure the dependencies for this script are installed: python-twitter 1.0 (https://code.google.com/p/python-twitter/) and geopy 0.95.1 (https://code.google.com/p/geopy/).

  2. Apply for a Twitter API key at https://dev.twitter.com/apps/new and enter it in tw.py.

  3. Make a list of the locations that you want to scrape and put them in cities.txt, one location per line.

  4. Provide tw.py, via settings.txt, with a number of initial seeds (such as newspapers or politicians) that reveal their location.

  5. Check the other fields in settings.txt and fill in the ones you need.
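As an illustration of step 3 and of the idea of working from a user's reported location, here is a minimal sketch in plain Python. It is not taken from tw.py: the function names `load_locations` and `match_location`, the example city names, and the whole-word matching rule are assumptions made for this example, and the actual script may resolve locations differently (for instance through geopy).

```python
import re


def load_locations(lines):
    """Return the non-empty lines, stripped, as a list of location names."""
    return [line.strip() for line in lines if line.strip()]


def match_location(reported, locations):
    """Return the first listed location that occurs as a whole word in
    a user's reported location string, or None if there is no match.

    Whole-word, case-insensitive matching is an assumption of this sketch.
    """
    for location in locations:
        pattern = r"\b" + re.escape(location) + r"\b"
        if re.search(pattern, reported, re.IGNORECASE):
            return location
    return None


# Hypothetical contents of cities.txt, one location per line.
cities = load_locations(["Leuven\n", "Gent\n", "\n", "Antwerpen\n"])

print(match_location("Living in Gent, Belgium", cities))
print(match_location("Buenos Aires, Argentina", cities))
```

The word-boundary check is there because a plain substring test would wrongly find "Gent" inside "Argentina". With a real cities.txt, the list could be built with `load_locations(open("cities.txt", encoding="utf-8"))`.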

Then run tw.py for some time and the corpus will grow. If something goes wrong, start it again and the script should pick up where it left off.
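How tw.py resumes is not described here. One common way to get this behaviour is to save the crawl state to disk after every processed user, as in the sketch below. Everything in it is hypothetical: the state file, its JSON layout (`done` and `queue`), and the functions `load_state`, `save_state` and `crawl` are illustrative names, and the real fetching of tweets is replaced by a comment.

```python
import json
import os
import tempfile


def load_state(path):
    """Load the saved crawl state, or return a fresh one if none exists."""
    if os.path.exists(path):
        with open(path, encoding="utf-8") as handle:
            return json.load(handle)
    return {"done": [], "queue": []}


def save_state(path, state):
    """Write to a temporary file and rename it over the old state, so an
    interrupted run does not leave a half-written state file behind."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(state, handle)
    os.replace(tmp_path, path)


def crawl(path, seeds, limit):
    """Process up to `limit` queued users, saving state after each one."""
    state = load_state(path)
    # Seed the queue only on the very first run.
    if not state["done"] and not state["queue"]:
        state["queue"] = list(seeds)
    processed = 0
    while state["queue"] and processed < limit:
        user = state["queue"].pop(0)
        # A real crawler would fetch this user's tweets and contacts here.
        state["done"].append(user)
        save_state(path, state)
        processed += 1
    return state
```

Calling `crawl("state.json", ["seed_a", "seed_b"], limit=1)` twice processes `seed_a` in the first call and `seed_b` in the second, because the second call starts from the saved queue rather than from the seeds.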
