See PDF in repo or https://www.sharelatex.com/6679699354fdqvmrgbtkrw
Usage (arguments in order: train file, test file, epochs): ./05-evaluate.py ../data/dataset-00-0.csv ../data/dataset-00-1.csv 5
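The argument order above (train file, test file, number of epochs) could be parsed as in the following minimal sketch. The function name `parse_args` and the argument names are assumptions for illustration; they are not taken from `05-evaluate.py` itself.

```python
import argparse


def parse_args(argv=None):
    """Parse three positional arguments: train CSV, test CSV, epochs.

    Hypothetical helper; the real script may read its arguments differently.
    """
    parser = argparse.ArgumentParser(
        description="Train on one CSV file and evaluate on another."
    )
    parser.add_argument("train", help="path to the training CSV")
    parser.add_argument("test", help="path to the test CSV")
    parser.add_argument("epochs", type=int, help="number of training epochs")
    return parser.parse_args(argv)


# Same values as the command line shown above.
args = parse_args(["../data/dataset-00-0.csv", "../data/dataset-00-1.csv", "5"])
```

Declaring `epochs` with `type=int` means a non-numeric third argument is rejected before any training starts.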
extract_features
: Read tweets from election_day_tweets and troll-tweets
wikidata
: Vectorize into tweets_output
cleaningDataSet.R
: troll-tweets -> dataset, election_day_tweets -> cleaned_elec2.csv
election_day_tweets header (columns numbered from 0):
- text
- created_at
- geo
- lang
- place
- coordinates
- user.favourites_count
- user.statuses_count
- user.description
- user.location
- user.id
- user.created_at
- user.verified
- user.following
- user.url
- user.listed_count
- user.followers_count
- user.default_profile_image
- user.utc_offset
- user.friends_count
- user.default_profile
- user.name
- user.lang
- user.screen_name
- user.geo_enabled
- user.profile_background_color
- user.profile_image_url
- user.time_zone
- id
- favorite_count
- retweeted
- source
- favorited
- retweet_count
The output adds three columns: 34 created_str, 35 hashtags, 36 mentions
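The three extra output columns (created_str, hashtags, mentions) could be derived from `text` and `created_at` roughly as sketched below. This is not the repository's code: the function name `derive_columns`, the regular expressions, the assumed `created_at` layout (Twitter's classic `Tue Nov 08 12:34:56 +0000 2016` style) and the `created_str` output format are all assumptions.

```python
import re
from datetime import datetime

# Assumed patterns: a hashtag or mention is '#' or '@' followed by word characters.
HASHTAG_RE = re.compile(r"#(\w+)")
MENTION_RE = re.compile(r"@(\w+)")

# Assumed input layout of created_at, e.g. "Tue Nov 08 12:34:56 +0000 2016".
TWITTER_TIME_FORMAT = "%a %b %d %H:%M:%S %z %Y"


def derive_columns(text, created_at):
    """Return the three derived values for one tweet (hypothetical helper)."""
    created = datetime.strptime(created_at, TWITTER_TIME_FORMAT)
    return {
        # Assumed output format for created_str.
        "created_str": created.strftime("%Y-%m-%d %H:%M:%S"),
        "hashtags": HASHTAG_RE.findall(text),
        "mentions": MENTION_RE.findall(text),
    }


row = derive_columns(
    "Go vote! #ElectionDay @user_1",
    "Tue Nov 08 12:34:56 +0000 2016",
)
```

Here `row["hashtags"]` is `["ElectionDay"]` and `row["mentions"]` is `["user_1"]`; both lists are empty when a tweet contains neither.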
extracting_features.py converts election_day_tweets.csv to election_day_tweets_output.csv
- user_id
- user_key
- created_at
- created_str
- retweet_count
- retweeted
- favorite_count
- text
- tweet_id
- source
- hashtags
- expanded_urls
- posted
- mentions
- retweeted_status_id
- in_reply_to_status_id
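A quick way to confirm that a CSV file carries the columns listed above is to compare its header row against the expected names. The sketch below uses only the standard library; the helper name `missing_columns` is hypothetical and not part of the repository.

```python
import csv
import io

# Column names copied from the list above.
EXPECTED_COLUMNS = [
    "user_id", "user_key", "created_at", "created_str",
    "retweet_count", "retweeted", "favorite_count", "text",
    "tweet_id", "source", "hashtags", "expanded_urls",
    "posted", "mentions", "retweeted_status_id", "in_reply_to_status_id",
]


def missing_columns(csv_file, expected=EXPECTED_COLUMNS):
    """Return the expected column names that are absent from the header row."""
    header = next(csv.reader(csv_file), [])
    present = set(header)
    return [name for name in expected if name not in present]


# Usage with an in-memory file; a real call would pass open(path, newline="").
sample = io.StringIO("user_id,text,hashtags\n1,hello,[]\n")
absent = missing_columns(sample)
```

An empty result means every listed column is present; otherwise the returned names show what the file lacks.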