Before use, replace the <INSERT_API_KEY> placeholder in the .env file with your own API key, which can be obtained from https://newsapi.org/.
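As a rough illustration of how the key might be read at runtime, the sketch below parses a .env file using only the standard library. The variable name `NEWS_API_KEY` and the helper names `load_env_file` and `get_api_key` are assumptions made for this example; the project's actual variable name may differ. The `python-dotenv` package (`load_dotenv()`) is a common alternative.

```python
import os
from pathlib import Path


def load_env_file(path=".env"):
    """Parse simple KEY=VALUE lines from a .env file into a dict."""
    values = {}
    env_path = Path(path)
    if not env_path.exists():
        return values
    for line in env_path.read_text().splitlines():
        line = line.strip()
        # Skip blank lines, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def get_api_key(path=".env", name="NEWS_API_KEY"):
    """Return the API key, preferring a real environment variable.

    NEWS_API_KEY is a hypothetical name chosen for this sketch.
    """
    key = os.environ.get(name) or load_env_file(path).get(name)
    if not key or key == "<INSERT_API_KEY>":
        raise RuntimeError(f"Set {name} in {path} before running the analysis.")
    return key
```

Failing early when the placeholder is still present avoids a less obvious authentication error from the API later on.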
Used the News API to pull news articles about Bitcoin and Ethereum and created a DataFrame of sentiment scores for each coin. Also calculated descriptive statistics for both coins. Below are the answers to questions based on those statistics.
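A minimal sketch of how such a sentiment DataFrame could be built is shown below. It assumes the scores come from NLTK's VADER analyzer (the compound, positive, and negative scores mentioned here match VADER's output, but the README does not name the analyzer) and that each article is a dict with a `content` field, as returned by the News API. The function names are illustrative only.

```python
import pandas as pd


def build_sentiment_df(articles, score_fn):
    """Score each article's text and collect the results in a DataFrame.

    score_fn must return a dict with 'compound', 'pos', 'neg' and 'neu'
    keys, which is the shape VADER's polarity_scores() produces.
    """
    rows = []
    for article in articles:
        text = article.get("content") or ""  # content can be missing or None
        scores = score_fn(text)
        rows.append({
            "compound": scores["compound"],
            "positive": scores["pos"],
            "negative": scores["neg"],
            "neutral": scores["neu"],
            "text": text,
        })
    return pd.DataFrame(rows)


def score_with_vader(articles):
    """Assumed setup: score articles with NLTK's VADER analyzer."""
    import nltk
    from nltk.sentiment.vader import SentimentIntensityAnalyzer

    nltk.download("vader_lexicon", quiet=True)  # one-time lexicon download
    analyzer = SentimentIntensityAnalyzer()
    return build_sentiment_df(articles, analyzer.polarity_scores)
```

Calling `describe()` on the resulting DataFrame gives the mean and max of each score column, which is the kind of summary the questions below rely on.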
Which coin had the highest mean positive score?
- Bitcoin
Which coin had the highest compound score?
- Ethereum
Which coin had the highest positive score?
- Bitcoin
Which coin had the highest negative score?
- Ethereum
Used NLTK and Python to tokenize the text for each coin. The following techniques were applied:
- Lowercase each word
- Remove punctuation
- Remove stopwords
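The three steps above can be sketched roughly as follows. This example assumes NLTK's `RegexpTokenizer` for splitting text; the README does not say which tokenizer was used. It also uses a tiny hand-written stopword set so that it runs without downloads. In practice, NLTK's English list is available via `nltk.corpus.stopwords.words("english")` after `nltk.download("stopwords")`.

```python
from nltk.tokenize import RegexpTokenizer


def tokenize(text, stop_words):
    """Lowercase the text, drop punctuation, and remove stopwords."""
    # Keeping only runs of letters discards punctuation and digits.
    tokenizer = RegexpTokenizer(r"[a-z]+")
    words = tokenizer.tokenize(text.lower())
    return [word for word in words if word not in stop_words]


# Illustrative stopword set; a stand-in for NLTK's full English list.
demo_stop_words = {"the", "a", "is", "of", "and", "to", "in"}

tokens = tokenize("The price of Bitcoin is rising, and Ethereum follows!", demo_stop_words)
print(tokens)
```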
Then performed n-gram and frequency analysis:
- Used NLTK to produce the n-grams (n=2) for each coin
- Listed the top 10 words for each coin
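A short sketch of the bigram and word-frequency counts, assuming the tokens have already been cleaned. It pairs `nltk.util.ngrams` with `collections.Counter`; the helper names and the sample token list are made up for illustration.

```python
from collections import Counter

from nltk.util import ngrams


def top_ngrams(tokens, n=2, k=10):
    """Return the k most frequent n-grams as (ngram, count) pairs."""
    return Counter(ngrams(tokens, n)).most_common(k)


def top_words(tokens, k=10):
    """Return the k most frequent words as (word, count) pairs."""
    return Counter(tokens).most_common(k)


# Made-up tokens standing in for a coin's cleaned article text.
sample_tokens = ["bitcoin", "price", "rises", "bitcoin", "price", "falls"]

print(top_ngrams(sample_tokens))
print(top_words(sample_tokens))
```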
Then produced a word cloud for each coin:
Bitcoin Word Cloud
Ethereum Word Cloud
Built a named entity recognition model for both Bitcoin and Ethereum, then visualized the entity tags using spaCy.
Bitcoin NER Visualization
Ethereum NER Visualization
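The entity extraction and visualization step might look roughly like the sketch below. So that it runs without a model download, it uses a blank English pipeline with a rule-based `entity_ruler` as a stand-in. This is not the project's model; a pretrained pipeline such as `spacy.load("en_core_web_sm")` would normally be passed in instead. The patterns, labels, and sample sentence are invented for the example.

```python
import spacy
from spacy import displacy


def extract_entities(nlp, text):
    """Run the pipeline and return the doc plus (text, label) pairs."""
    doc = nlp(text)
    return doc, [(ent.text, ent.label_) for ent in doc.ents]


# Stand-in pipeline (assumption): rule-based matching instead of a trained model.
nlp = spacy.blank("en")
ruler = nlp.add_pipe("entity_ruler")
ruler.add_patterns([
    {"label": "ORG", "pattern": "Tesla"},
    {"label": "PRODUCT", "pattern": "Bitcoin"},  # hypothetical label choice
])

doc, entities = extract_entities(nlp, "Tesla bought Bitcoin in February.")
print(entities)

# With jupyter=False, displaCy returns the highlighted markup as an HTML string.
html = displacy.render(doc, style="ent", jupyter=False)
```

Inside a notebook, calling `displacy.render(doc, style="ent")` displays the highlighted text inline instead of returning a string.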