Feature engineering and data cleaning on text data, using lemmatization and stop word removal.

jianninapinto/Coffee-Shops-Review-Analysis-using-NLP

Unit 4 - Sprint 13 - Natural Language Processing (NLP)

Assignment 1

The goal of the assignment is to find the attributes of the best and worst coffee shops in the dataset. The text is fairly raw: dates appear inside the review text, the star_rating column contains extra words, and so on. We therefore clean the data first so that the analysis is more reliable.
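
A minimal sketch of this kind of cleanup, using only the Python standard library. The raw formats are assumptions made for illustration: a review that starts with a date such as `11/25/2016`, and a rating string such as ` 5.0 star rating `. The helper names `split_date` and `parse_star_rating` are hypothetical and are not taken from the assignment notebook.

```python
import re

# Assumed format: the review text starts with a date like 11/25/2016.
DATE_PATTERN = re.compile(r"^\s*(\d{1,2}/\d{1,2}/\d{4})\s*")
# Assumed format: the rating is the first number in the raw string.
RATING_PATTERN = re.compile(r"(\d+(?:\.\d+)?)")


def split_date(review_text):
    """Return (date_string, remaining_text); date_string is None if absent."""
    match = DATE_PATTERN.match(review_text)
    if match is None:
        return None, review_text.strip()
    return match.group(1), review_text[match.end():].strip()


def parse_star_rating(raw_rating):
    """Extract the numeric rating from the raw string, or None if no number."""
    match = RATING_PATTERN.search(raw_rating)
    return float(match.group(1)) if match else None
```

Applied row by row, these helpers would give a separate date value, a review text without the date prefix, and a numeric rating that can be used for grouping.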

We start by analyzing the corpus with text visualizations of token frequency, then clean the data using techniques such as lemmatization and stop word removal.
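
Token frequency with stop word removal can be sketched as follows. This is an illustration under stated assumptions, not the notebook's code: the stop word list is a tiny hand-written sample, the tokenizer is a simple regular expression, and `lemmatize` is a pluggable function that defaults to the identity. In practice a real lemmatizer from a library such as spaCy or NLTK would be passed in.

```python
import re
from collections import Counter

# Tiny illustrative stop word list; a real analysis would use a fuller one.
STOP_WORDS = frozenset(
    {"the", "a", "an", "and", "is", "was", "it", "to", "of", "i", "in"}
)


def tokenize(text):
    """Lowercase the text and keep runs of letters and apostrophes."""
    return re.findall(r"[a-z']+", text.lower())


def token_counts(reviews, stop_words=STOP_WORDS, lemmatize=lambda token: token):
    """Count tokens across reviews, skipping stop words.

    `lemmatize` maps a token to its base form; the default leaves it unchanged.
    """
    counts = Counter()
    for review in reviews:
        for token in tokenize(review):
            if token in stop_words:
                continue
            counts[lemmatize(token)] += 1
    return counts
```

Calling `token_counts(reviews).most_common(20)` gives the ranked list that a frequency bar chart or similar visualization would be built from.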

Based on the analysis, we will answer the question: what makes the best coffee shops the best, and the worst the worst? Graphs and numbers from the analysis should support the conclusions.
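
One way to put numbers behind that comparison is to measure, for each token, the share of good reviews that mention it minus the share of bad reviews that mention it. The sketch below assumes the reviews have already been tokenized and split into a good group and a bad group by star rating; the split rule and the function names `token_share` and `share_gap` are hypothetical.

```python
from collections import Counter


def token_share(token_lists):
    """Fraction of reviews in which each token appears at least once."""
    appearances = Counter()
    for tokens in token_lists:
        appearances.update(set(tokens))  # count each token once per review
    total = len(token_lists)
    if total == 0:
        return {}
    return {token: n / total for token, n in appearances.items()}


def share_gap(good_reviews, bad_reviews):
    """Rank tokens by (share in good reviews) minus (share in bad reviews)."""
    good = token_share(good_reviews)
    bad = token_share(bad_reviews)
    vocabulary = set(good) | set(bad)
    gaps = {t: good.get(t, 0.0) - bad.get(t, 0.0) for t in vocabulary}
    return sorted(gaps.items(), key=lambda item: item[1], reverse=True)
```

Tokens at the top of the ranking are characteristic of well-rated shops, and tokens at the bottom are characteristic of poorly rated ones; tokens near zero appear about equally often in both groups.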
