dtuit/twitter_scraper

Twitter_Scraper

NOTE: currently under development

Scrapes tweets from twitter.com and inserts them into a SQL Server database.
Uses Celery, the asynchronous task queue, as a framework.
Tested on Ubuntu 14.04 with Python 3.4.

Install requirements

  • Python
  • Celery
    • pip install Celery
  • pymssql
    • sudo apt-get install freetds-dev freetds-bin
    • pip install pymssql
  • requests
  • lxml
    • sudo apt-get install python3-lxml
  • cssselect
    • pip install cssselect
  • RabbitMQ
    • sudo apt-get install rabbitmq-server
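
After installing, a quick check that the Python packages are importable can save a failed first run. The snippet below is a minimal sketch, not part of this repository: the helper name `missing_modules` is hypothetical, and the module names are assumed to match the package names listed above. RabbitMQ is a system service rather than a Python module, so it is not checked here.

```python
import importlib.util

# Import names for the Python packages listed above (assumed to match
# the package names; RabbitMQ is a service and is not covered).
MODULES = ["celery", "pymssql", "requests", "lxml", "cssselect"]


def missing_modules(names=MODULES):
    """Return the names from `names` that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("missing: " + ", ".join(missing))
    else:
        print("all Python requirements found")
```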

Create a keys.json file that contains the SQL Server connection parameters:

{
    "server":  "SERVER.database.windows.net",
    "user": "USER@SERVER",
    "password": "password",
    "database": "databasename"
}
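
For illustration, the parameters in keys.json could be read and passed to pymssql roughly as follows. This is a sketch under assumptions, not the scraper's actual code: `load_keys` and `connect` are hypothetical helper names, and the file is assumed to sit in the working directory.

```python
import json

# The four fields shown in the keys.json example above.
REQUIRED_KEYS = ("server", "user", "password", "database")


def load_keys(path="keys.json"):
    """Read the connection parameters and check that all fields are present."""
    with open(path, encoding="utf-8") as fh:
        keys = json.load(fh)
    missing = [k for k in REQUIRED_KEYS if k not in keys]
    if missing:
        raise KeyError("keys.json is missing: " + ", ".join(missing))
    return keys


def connect(keys):
    """Open a pymssql connection from the loaded parameters."""
    # Imported here so load_keys can be used without the driver installed.
    import pymssql

    return pymssql.connect(
        server=keys["server"],
        user=keys["user"],
        password=keys["password"],
        database=keys["database"],
    )
```

Calling `connect(load_keys())` would then return a connection object. Since keys.json holds credentials, it is best kept out of version control.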

Note: use the --recursive option when cloning to also clone the submodule.
