
Applied machine learning and natural language processing to build a Random Forest classifier that filters malicious bots with 97% accuracy.


bigforehead/Twitter-Malicious-Bot-Detection-Project


Twitter-Malicious-Bot-Detection-Project

Introduction

  • "Can my product ads reach real users on Instagram and avoid bots (e.g., fake followers)?" Yes, but how? This project offers one answer. As a team of data scientists at Fordham, we built a Random Forest classifier with 97% accuracy that filters out social bots using Twitter data. The same approach can be applied to other social media platforms such as Instagram and Facebook. Filtering out bots can help improve ad revenue by maximizing the organic exposure of ads on social media.

Goal

  • The project's purpose is to classify three types of malicious Twitter bots: fake followers, spam bots, and scam bots.

Methodology

  • Scrape tweets using the Twitter API (Tools: Python).
  • Machine learning: Random Forest (Tools: SPSS Modeler).
  • Natural language processing: derive new TF-IDF features to analyze tweets for the three identified bot types (Tools: NLTK).
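
The TF-IDF step above can be sketched in a few lines of Python. This is only an illustration of the idea, not the project's actual pipeline: it uses the standard library instead of NLTK, a deliberately naive whitespace tokenizer, and the plain `log(N / df)` form of inverse document frequency (one common variant among several). The function names and the sample tweets are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive tokenizer (assumption): lowercase, split on whitespace,
    # keep purely alphabetic tokens. A real pipeline would likely use NLTK.
    return [tok for tok in text.lower().split() if tok.isalpha()]


def tfidf(docs):
    """Return one {term: weight} dict per document."""
    tokenized = [tokenize(doc) for doc in docs]
    n_docs = len(tokenized)

    # Document frequency: in how many documents each term appears.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1  # avoid division by zero on empty tweets
        vectors.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


# Made-up tweets for illustration only, not project data.
tweets = [
    "win free followers now",
    "free followers free likes",
    "lunch with friends today",
]
vectors = tfidf(tweets)
for tweet, vector in zip(tweets, vectors):
    top_term = max(vector, key=vector.get)
    print(f"{tweet!r}: highest-weighted term is {top_term!r}")
```

Terms that appear in only one tweet receive a higher weight than terms shared across tweets, which is what makes these features useful for separating bot types by vocabulary. The resulting weights could then serve as input columns for a classifier such as the Random Forest described above.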

Result

  • A Random Forest model with an accuracy rate of 97%.
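
For reference, accuracy is the share of accounts whose predicted label matches the true label. The README does not say how the 97% figure was measured, so the sketch below only shows the arithmetic, along with a per-class breakdown that is often worth checking when there are several bot types. The label names and the sample predictions are hypothetical, not project results.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    # Fraction of predictions that match the true label.
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def per_class_recall(y_true, y_pred):
    # For each true class, the fraction of its members predicted correctly.
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / totals[label] for label in totals}


# Invented labels for illustration only.
y_true = ["fake_follower", "fake_follower", "spam_bot", "spam_bot", "scam_bot", "scam_bot"]
y_pred = ["fake_follower", "fake_follower", "spam_bot", "spam_bot", "scam_bot", "spam_bot"]

print(f"accuracy: {accuracy(y_true, y_pred):.3f}")
print(per_class_recall(y_true, y_pred))
```

In this toy example the overall accuracy is 5 out of 6, while recall for the hypothetical `scam_bot` class is only 0.5, which shows how a single headline number can hide weaker performance on one class.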
