Streamlined NLP Pipeline Modeling on Sentiment Analysis

The classic sentiment analysis project, now with a slightly different twist in the level of challenge.

In this project, we build a streamlined sentiment analysis model using an sklearn pipeline and a variety of familiar machine learning models. The pipeline takes a raw text corpus as input and returns the scored sentiment. The implementation details are explained in the two-part blog series on Medium listed below.

Understanding Text Vectorizations I: How Having a Bag of Words Already Shows What People Think About Your Product

Understanding Text Vectorizations II: How TF-IDF Gives Your Simple Models the Power to Rival the Advanced Ones

Environment Setup

Clone the repo locally.
In your terminal, navigate to the directory where you cloned this repo.
Type pip install poetry. This installs the package management system poetry, which will help you install all the required dependencies.
Type poetry install.
Type pip install jupyter. You should now be able to view the notebook that generated the analysis.

Streamlined Model

The StreamlinedModel object allows us to build a transformer-model pipeline structure. This pipeline is flexible enough to use any transformer/model combination. For example, we can use the sklearn built-in CountVectorizer and logistic regression model to build a pipeline as follows:

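from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

# StreamlinedModel is assumed to be importable from this repo's sentiment_analysis package.
logistic = StreamlinedModel(
    transformer_description="Bag of words",
    transformer=CountVectorizer,
    model_description="logistic regression model",
    model=LogisticRegression,
)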
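
For comparison, here is a minimal sketch of the same transformer-model combination written with the plain sklearn Pipeline class. It is not the StreamlinedModel implementation, whose internals are not shown here. It only illustrates the idea of taking raw text in and returning a sentiment score. The tiny corpus, its labels, and the step names are made up for illustration.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

# Tiny made-up corpus for illustration only (1 = positive, 0 = negative).
texts = [
    "great product, love it",
    "terrible, would not buy again",
    "love the quality",
    "terrible experience",
]
labels = [1, 0, 1, 0]

# Chain the vectorizer and the classifier so raw text can be fed in directly.
pipeline = Pipeline([
    ("bag_of_words", CountVectorizer()),
    ("logistic_regression", LogisticRegression()),
])
pipeline.fit(texts, labels)

# predict_proba returns one row per document: [P(negative), P(positive)].
scores = pipeline.predict_proba(["love it"])
```

Because the vectorizer is a step in the pipeline, the same object handles both tokenization and scoring, so new raw text never has to be vectorized by hand.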

Customized Transformers

We will build our own bag-of-words and TF-IDF transformers, which can be passed as the transformer argument to StreamlinedModel.
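
As a rough illustration of what such transformers might look like, the sketch below implements both in pure Python, following the sklearn fit/transform convention. The class names SimpleBagOfWords and SimpleTfidf are hypothetical and are not the repo's actual classes. Whitespace tokenization and the smoothed IDF formula log((1 + n) / (1 + df)) + 1 are assumptions chosen for brevity. For full compatibility with sklearn utilities, one would typically also inherit from BaseEstimator and TransformerMixin.

```python
import math
from collections import Counter


class SimpleBagOfWords:
    """Hypothetical bag-of-words transformer with a fit/transform interface."""

    def fit(self, documents, y=None):
        # Build a sorted vocabulary and map each token to a column index.
        tokens = sorted({tok for doc in documents for tok in doc.lower().split()})
        self.vocabulary_ = {tok: idx for idx, tok in enumerate(tokens)}
        return self

    def transform(self, documents):
        # Count how often each known token occurs in each document;
        # tokens not seen during fit are ignored.
        rows = []
        for doc in documents:
            counts = Counter(doc.lower().split())
            row = [0] * len(self.vocabulary_)
            for tok, n in counts.items():
                if tok in self.vocabulary_:
                    row[self.vocabulary_[tok]] = n
            rows.append(row)
        return rows

    def fit_transform(self, documents, y=None):
        return self.fit(documents, y).transform(documents)


class SimpleTfidf(SimpleBagOfWords):
    """Hypothetical TF-IDF transformer built on the raw counts above."""

    def fit(self, documents, y=None):
        super().fit(documents, y)
        counts = super().transform(documents)
        n_docs = len(counts)
        # Document frequency: number of documents containing each token.
        doc_freq = [
            sum(1 for row in counts if row[j] > 0)
            for j in range(len(self.vocabulary_))
        ]
        # Smoothed inverse document frequency (an assumed variant).
        self.idf_ = [math.log((1 + n_docs) / (1 + df)) + 1 for df in doc_freq]
        return self

    def transform(self, documents):
        # Scale each raw count by the IDF weight of its token.
        counts = super().transform(documents)
        return [[n * w for n, w in zip(row, self.idf_)] for row in counts]


docs = ["good movie", "bad movie"]
bow = SimpleBagOfWords().fit_transform(docs)
tfidf = SimpleTfidf().fit_transform(docs)
```

In this toy corpus, "movie" appears in every document, so it gets the lowest IDF weight, while "good" and "bad" are weighted more heavily. Unlike sklearn's TfidfVectorizer defaults, this sketch does not normalize the rows.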

chen-bowen/Streamlined_Sentiment_Analysis
