Big_Data_Project_US-Airlines_Tweet_Processing_and_Analysis

A big data application of machine learning for sentiment classification of US airline tweets. The focus is on using PySpark's machine learning library (MLlib) to solve a problem on big data, not on the choice of algorithm used to build the model. The project also covers data pre-processing with NLP techniques, cross-validation, and parameter-grid building.
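As a rough illustration of how these pieces can fit together, here is a minimal sketch, not the project's actual code. The column names `text` and `airline_sentiment`, the helper names `clean_tweet` and `build_cross_validator`, the choice of logistic regression, and the grid values are all assumptions made for the example.

```python
import re


def clean_tweet(text: str) -> str:
    """Lowercase a tweet and strip URLs, @mentions, and non-letter characters."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)  # drop links
    text = re.sub(r"@\w+", " ", text)          # drop @mentions
    text = re.sub(r"[^a-z\s]", " ", text)      # keep letters only
    return " ".join(text.split())              # collapse whitespace


def build_cross_validator(num_folds: int = 3):
    """Return a CrossValidator wrapping a text-classification Pipeline.

    Assumes an input DataFrame with a string column `text` and a string
    label column `airline_sentiment` (hypothetical names).
    """
    # Imported here so clean_tweet stays usable without a Spark installation.
    from pyspark.ml import Pipeline
    from pyspark.ml.classification import LogisticRegression
    from pyspark.ml.evaluation import MulticlassClassificationEvaluator
    from pyspark.ml.feature import (HashingTF, IDF, StopWordsRemover,
                                    StringIndexer, Tokenizer)
    from pyspark.ml.tuning import CrossValidator, ParamGridBuilder

    # Pre-processing stages: tokenize, remove stop words, TF-IDF features.
    tokenizer = Tokenizer(inputCol="text", outputCol="words")
    remover = StopWordsRemover(inputCol="words", outputCol="filtered")
    hashing_tf = HashingTF(inputCol="filtered", outputCol="raw_features")
    idf = IDF(inputCol="raw_features", outputCol="features")
    # Map the string sentiment to a numeric `label` column.
    indexer = StringIndexer(inputCol="airline_sentiment", outputCol="label")
    lr = LogisticRegression(maxIter=20)  # assumed classifier, easily swapped

    pipeline = Pipeline(stages=[tokenizer, remover, hashing_tf, idf, indexer, lr])

    # Parameter grid: every combination is evaluated in each fold.
    grid = (ParamGridBuilder()
            .addGrid(hashing_tf.numFeatures, [1 << 10, 1 << 14])
            .addGrid(lr.regParam, [0.01, 0.1])
            .build())

    evaluator = MulticlassClassificationEvaluator(metricName="accuracy")
    return CrossValidator(estimator=pipeline, estimatorParamMaps=grid,
                          evaluator=evaluator, numFolds=num_folds)
```

With an active `SparkSession` and a training DataFrame `train_df` (hypothetical), `build_cross_validator().fit(train_df)` would fit the pipeline once per fold and grid point and keep the best model. `clean_tweet` could be applied beforehand, for example through a UDF, to normalize the raw tweet text.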

The input and output files are included in the repository; the URLs in the code are there only so that it can run on the Databricks cluster.