SentimentAnalysis

Sentiment analysis of people's opinions about a company's services, classified by date, service, and place. This case covers opinions from people in DKI Jakarta about PT. PLN's electricity services, collected from Twitter, especially tweets addressed to PLN's official Twitter account.

How does it work? PyCharm makes it easy to install all the libraries this project uses: tweepy, numpy, Sastrawi (a stemmer for Indonesian words), sklearn (installed as the scikit-learn package), and pandas. You can download PyCharm from https://www.jetbrains.com/pycharm/download/#section=windows. Don't forget to install Python 3.8 and set it as the interpreter in PyCharm. This project does not work on Python 2.x.x, and it is not expected to work on 4.x.x or above (if Python 4 is ever released).

Run Crawling.py to collect new opinions about the services (in this case, electricity in DKI Jakarta). Before crawling, you must get the consumer API keys, access token, and access token secret for your app. You can get them at https://developer.twitter.com/
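The crawling step could look roughly like the sketch below. It is not the contents of Crawling.py: the function names format_record and crawl, the query string, and the choice of the standard search endpoint are all assumptions made for illustration. It only shows how tweepy credentials are typically wired up and how each result can be written in the "Date | Tweet" format used by the tweets_all files.

```python
import re


def format_record(created_at, text):
    """Return one 'Date | Tweet' line (hypothetical helper, not from Crawling.py).

    Runs of whitespace, including newlines inside a tweet, are collapsed
    so that every tweet stays on a single line.
    """
    cleaned = re.sub(r"\s+", " ", str(text)).strip()
    return f"{created_at} | {cleaned}"


def crawl(query, limit, consumer_key, consumer_secret,
          access_token, access_token_secret):
    """Yield 'Date | Tweet' lines for a search query (illustrative sketch)."""
    import tweepy  # third-party; imported here so format_record works without it

    auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
    auth.set_access_token(access_token, access_token_secret)
    api = tweepy.API(auth, wait_on_rate_limit=True)

    # The search method is named `search` in tweepy 3.x and `search_tweets` in 4.x.
    search = getattr(api, "search_tweets", None) or api.search

    cursor = tweepy.Cursor(search, q=query, tweet_mode="extended")
    for status in cursor.items(limit):
        yield format_record(status.created_at, status.full_text)


if __name__ == "__main__":
    # No network access needed for this part of the example.
    print(format_record("2020-03-01 10:15:00", "Listrik padam\n di  Gambir"))
```

To actually crawl, call crawl() with your own keys and a query of your choice (for example "PLN listrik Jakarta", which is a made-up query, not the one used by the project) and write the yielded lines to a text file.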

Run SentimentAnalysis.py to get several bar charts showing people's sentiments classified by date, service, and place.
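As the notes below mention, labels are predicted with the K-Nearest Neighbor (K-NN) method. The project lists sklearn as a dependency, so the real script most likely relies on it; the sketch below is a dependency-free illustration of the K-NN idea only. The bag-of-words vectors, cosine similarity, k=3, and the tiny training set are all assumptions chosen to keep the example short.

```python
import math
from collections import Counter


def vectorize(text):
    """Bag-of-words term counts for a lowercased, whitespace-tokenized text."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine similarity between two term-count vectors (0.0 if either is empty)."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def knn_predict(training, text, k=3):
    """Return the majority label among the k training tweets most similar to text.

    `training` is a list of (label, tweet) pairs, mirroring the
    "Label | Tweet" layout of tweets_training.txt.
    """
    query = vectorize(text)
    scored = sorted(
        ((cosine_similarity(query, vectorize(tweet)), label)
         for label, tweet in training),
        key=lambda pair: pair[0],
        reverse=True,
    )
    top_labels = [label for _, label in scored[:k]]
    return Counter(top_labels).most_common(1)[0][0]


# Made-up training examples, for illustration only.
TRAINING = [
    ("positive", "listrik sudah menyala terima kasih"),
    ("positive", "terima kasih pelayanan cepat"),
    ("negative", "listrik mati lagi"),
    ("negative", "mati lampu sejak pagi"),
]

if __name__ == "__main__":
    print(knn_predict(TRAINING, "listrik mati sejak pagi"))
```

A real pipeline would first run the preprocessing steps (normalization and stemming) on both the training and the testing tweets before computing similarities.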

Run Evaluation.py to get the overall accuracy of SentimentAnalysis.py.
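Accuracy here presumably means the share of predicted labels that match the reference labels. The sketch below shows that calculation; it is not the code in Evaluation.py. The repository also contains a tweets_actual_labels.txt file, which presumably holds the reference labels, but the one-label-per-line layout assumed in read_labels is a guess.

```python
def read_labels(path):
    """Read one label per non-empty line (file layout is an assumption)."""
    with open(path, encoding="utf-8") as handle:
        return [line.strip() for line in handle if line.strip()]


def accuracy(actual, predicted):
    """Fraction of positions where the predicted label equals the actual label."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        return 0.0
    correct = sum(a == p for a, p in zip(actual, predicted))
    return correct / len(actual)


if __name__ == "__main__":
    # Three of the four made-up predictions match, so this prints 0.75.
    actual = ["positive", "neutral", "negative", "negative"]
    predicted = ["positive", "negative", "negative", "negative"]
    print(accuracy(actual, predicted))
```

With real data, the two lists would come from read_labels("tweets_actual_labels.txt") and read_labels("tweets_predicted_labels.txt"), provided both files follow the assumed layout.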

Notes :

-list_cleaned_tweets.txt contains all cleaned tweets produced by the function text_preprocessing

-list_daerah.txt contains all places to classify by, in this case in DKI Jakarta: Kotamadya, Kecamatan, and Kelurahan. The fields are separated by "|" in the format "Kelurahan | Kecamatan | Kotamadya"

-normalization_words.txt contains all words that must be normalized, for example slang words, abbreviations, and other non-standard words.

-tweets_all.txt, tweets_all_2.txt, etc. contain the crawling results for the search query in Crawling.py. Each entry holds a date and a tweet in the format "Date | Tweet"

-tweets_predicted_labels.txt contains the labels predicted by the K-Nearest Neighbor (K-NN) method, based on the labels in tweets_training.txt

-tweets_testing.txt is the testing data to be classified as positive, neutral, or negative sentiment, in the format "Date | Tweet"

-tweets_training.txt is the training data used as the basis for classification, in the format "Label | Tweet"
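Several of the files above share a "|"-separated layout. A minimal sketch of reading such lines is shown below; parse_pipe_line is a hypothetical helper, not a function from the project, and the sample date format is made up. Splitting at most field_count - 1 times keeps any "|" that appears inside the tweet text itself.

```python
def parse_pipe_line(line, field_count):
    """Split a 'A | B | ...' line into exactly field_count stripped fields.

    Only the first field_count - 1 separators are used, so a '|' inside
    the last field (for example the tweet text) is preserved.
    """
    parts = [part.strip() for part in line.split("|", field_count - 1)]
    if len(parts) != field_count:
        raise ValueError(f"expected {field_count} fields, got {len(parts)}: {line!r}")
    return parts


if __name__ == "__main__":
    # "Date | Tweet", as in the tweets_all and tweets_testing files.
    print(parse_pipe_line("2020-03-01 | listrik padam | sejak pagi", 2))
    # "Label | Tweet", as in tweets_training.txt.
    print(parse_pipe_line("negative | listrik mati lagi", 2))
    # "Kelurahan | Kecamatan | Kotamadya", as in list_daerah.txt.
    print(parse_pipe_line("Gambir | Gambir | Jakarta Pusat", 3))
```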
