DURHAM STREAM SCRAPER

This library retrieves data about news, events, and happenings from the Triangle area.

This library is in beta. At the moment it gathers data only by retrieving the Twitter timelines of specified users.

Gathering Tweets from Durham Tweeters

You will need Twitter API credentials to use this library. Store your keys in config/twitter/twitter_keys.yml. See config/twitter/twitter_client.rb for more details.
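As a rough illustration of the key file described above, here is a minimal sketch that loads a YAML credentials file and fails early when an entry is missing. The key names (consumer_key, consumer_secret, access_token, access_token_secret) and the load_twitter_keys helper are assumptions made for this example, not taken from the library; config/twitter/twitter_client.rb remains the authority on what the file must contain.

```ruby
require "yaml"

# Assumed key names, for illustration only. The real file may use others.
REQUIRED_TWITTER_KEYS = %w[
  consumer_key consumer_secret access_token access_token_secret
].freeze

# Hypothetical helper: read the YAML file and return a Hash of credentials,
# raising ArgumentError if the file is malformed or a required key is blank.
def load_twitter_keys(path = "config/twitter/twitter_keys.yml")
  keys = YAML.safe_load(File.read(path)) || {}
  raise ArgumentError, "#{path} must contain a YAML mapping" unless keys.is_a?(Hash)

  missing = REQUIRED_TWITTER_KEYS.select { |name| keys[name].to_s.strip.empty? }
  raise ArgumentError, "missing Twitter keys: #{missing.join(', ')}" if missing.any?

  keys
end
```

Calling load_twitter_keys from a console session is a quick way to confirm the file is readable before any request to Twitter is made.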

To interact via a console session, run

ENV=development bundle exec ./console

To import all tweets to pass to the RTP Events API:

tg = DurhamScraper::TwitterImporter.create!
tg.recent_durham_tweets_by_username

This returns Tweet objects (see the twitter gem documentation for additional info).

Importing to Events Service

Clone the Durham Stream Service, install dependencies, and then launch the server locally (see Durham Stream Service README).

Then, from the command line, run:

ENV=development bundle exec ruby bin/import

Or, launch a console session and run the following:

i = DurhamScraper::Importer.new
i.upload_to_api
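
For readers curious what an upload to a locally running service might involve, the following is a minimal sketch using only the Ruby standard library. It is not the Importer's actual implementation: the endpoint URL, the payload fields, and the helper names build_event_request and post_event are all hypothetical. Consult the Durham Stream Service README for the real routes and parameters.

```ruby
require "json"
require "net/http"
require "uri"

# Hypothetical endpoint for a locally running service (an assumption).
EVENTS_URI = URI("http://localhost:3000/events")

# Build, but do not send, a JSON POST request for a single event payload.
def build_event_request(payload, uri = EVENTS_URI)
  request = Net::HTTP::Post.new(uri)
  request["Content-Type"] = "application/json"
  request.body = JSON.generate(payload)
  request
end

# Send the request and return the Net::HTTPResponse.
def post_event(payload, uri = EVENTS_URI)
  Net::HTTP.start(uri.host, uri.port) do |http|
    http.request(build_event_request(payload, uri))
  end
end
```

Keeping request construction separate from sending makes the payload easy to inspect in a console session before anything reaches the server.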

TO DO:

  • Regex parsing of tweets for purpose of adding tags
  • Scrape INDY Week or some other site(s)
  • Set up to deploy via Capistrano
  • Write tests for Onboarder
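
As a starting point for the regex tagging item above, here is a minimal sketch that pulls hashtags out of tweet text and normalizes them into tags. The extract_tags helper and the pattern are assumptions for illustration; nothing like this exists in the library yet.

```ruby
# '#' followed by one or more ASCII letters, digits, or underscores.
HASHTAG_PATTERN = /#(\w+)/

# Hypothetical helper: return the unique, lowercased hashtags found in text.
def extract_tags(text)
  text.scan(HASHTAG_PATTERN).flatten.map(&:downcase).uniq
end

extract_tags("Live music at #DPAC tonight! #Durham #durham")
# => ["dpac", "durham"]
```

Lowercasing before deduplicating means #Durham and #durham collapse into a single tag.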