The classic sentiment analysis project, now with a slightly different twist in the level of challenge.
In this project, we build a streamlined sentiment analysis model using an sklearn pipeline and a range of familiar machine learning models. The pipeline takes in a raw text corpus and returns the scored sentiment. The implementation details are explained in the two-part blog series on Medium below.
- Clone the repo locally
- In your terminal, navigate to the directory where you have just cloned this repo
- Type `pip install poetry`. This installs the package management system poetry, which will help you install all the required dependencies
- Type `poetry install`
- Type `pip install jupyter`. Now you should be able to view the notebook that generated the above analysis
The `StreamlinedModel` object allows us to build a transformer-model pipeline structure. This pipeline has the flexibility to use any transformer/model combination. For example, we can use the sklearn built-in `CountVectorizer` and logistic regression model to build a pipeline as follows:
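from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

# StreamlinedModel is defined in this repo and must be imported from it.
logistic = StreamlinedModel(
    transformer_description="Bag of words",
    transformer=CountVectorizer,
    model_description="logistic regression model",
    model=LogisticRegression,
)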
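To illustrate the transformer-model structure, here is a minimal sketch of how such a wrapper could be assembled on top of sklearn's `Pipeline`. The class name `PipelineSketch`, its `train` and `predict` methods, and the toy data are assumptions made for illustration only; the actual `StreamlinedModel` in this repo may be implemented differently.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline


class PipelineSketch:
    """Hypothetical stand-in for StreamlinedModel (not the repo's code)."""

    def __init__(self, transformer_description, transformer,
                 model_description, model):
        # Both `transformer` and `model` are passed as classes and
        # instantiated here with their default arguments.
        self.transformer_description = transformer_description
        self.model_description = model_description
        self.pipeline = Pipeline([
            ("transformer", transformer()),
            ("model", model()),
        ])

    def train(self, texts, labels):
        # Fit the transformer and the model in one call on raw text.
        self.pipeline.fit(texts, labels)
        return self

    def predict(self, texts):
        # Raw text in, predicted sentiment labels out.
        return self.pipeline.predict(texts)


# Toy corpus invented for this example: 1 = positive, 0 = negative.
train_texts = ["great movie", "loved it", "terrible film", "hated it"]
train_labels = [1, 1, 0, 0]

sketch = PipelineSketch(
    transformer_description="Bag of words",
    transformer=CountVectorizer,
    model_description="logistic regression model",
    model=LogisticRegression,
)
sketch.train(train_texts, train_labels)
predictions = sketch.predict(["loved this great movie", "terrible, hated it"])
```

Passing the transformer and model as classes rather than instances mirrors the call shown above, and it keeps the wrapper agnostic about which combination is used.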
We will also build our own bag-of-words and TF-IDF transformers, which can be passed as the `transformer` argument to `StreamlinedModel`.
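As a rough idea of what hand-built transformers could look like, the sketch below implements bag of words and TF-IDF in plain Python. The class names `SimpleBagOfWords` and `SimpleTfidf`, the whitespace tokenization, and the smoothed IDF formula are all assumptions chosen for illustration; the transformers in this repo may differ, and whether these sketches plug directly into `StreamlinedModel` depends on how it instantiates its transformer.

```python
import math
from collections import Counter


class SimpleBagOfWords:
    """Hypothetical bag-of-words transformer with a fit/transform interface."""

    def fit(self, documents, y=None):
        # Build a sorted vocabulary so that column order is deterministic.
        tokens = {tok for doc in documents for tok in doc.lower().split()}
        self.vocabulary_ = {tok: i for i, tok in enumerate(sorted(tokens))}
        return self

    def transform(self, documents):
        rows = []
        for doc in documents:
            counts = Counter(doc.lower().split())
            row = [0] * len(self.vocabulary_)
            for tok, n in counts.items():
                if tok in self.vocabulary_:  # ignore tokens unseen in fit
                    row[self.vocabulary_[tok]] = n
            rows.append(row)
        return rows

    def fit_transform(self, documents, y=None):
        return self.fit(documents).transform(documents)


class SimpleTfidf(SimpleBagOfWords):
    """Hypothetical TF-IDF transformer: raw count times smoothed IDF."""

    def fit(self, documents, y=None):
        super().fit(documents)
        n_docs = len(documents)
        counts = super().transform(documents)
        # Document frequency: number of documents containing each token.
        doc_freq = [sum(1 for row in counts if row[j] > 0)
                    for j in range(len(self.vocabulary_))]
        # Smoothed IDF, ln((1 + n) / (1 + df)) + 1, one common variant.
        self.idf_ = [math.log((1 + n_docs) / (1 + df)) + 1 for df in doc_freq]
        return self

    def transform(self, documents):
        counts = super().transform(documents)
        return [[c * w for c, w in zip(row, self.idf_)] for row in counts]


# Toy documents invented for this example.
docs = ["good good movie", "bad movie"]
bow = SimpleBagOfWords().fit_transform(docs)
tfidf = SimpleTfidf().fit_transform(docs)
```

Here the vocabulary columns are ordered alphabetically (`bad`, `good`, `movie`), and a word that appears in every document, such as `movie`, receives the lowest IDF weight.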