josephcappadona/opinion-mining-amazon-reviews: CoreNLP dependency parsing + word2vec + VADER sentiment analysis => opinion mining

Steps

1. Preprocess

Downloads reviews and review metadata. Review metadata contains information on the products and subcategories in a given category heading. For example, the "Electronics" category contains the subcategories "Computers & Accessories", "Speakers", "Headphones", etc.
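
As a rough illustration of how the metadata can be used to narrow reviews down to one subcategory, here is a minimal sketch. The field names (`asin`, `categories`, `reviewText`) and both helper functions are assumptions made for this example and are not taken from the repository's scripts.

```python
def products_in_subcategory(metadata_records, subcategory):
    """Return the IDs of products whose category path contains `subcategory`."""
    product_ids = set()
    for record in metadata_records:
        # Assumed layout: "categories" is a list of category paths,
        # each path being a list of strings.
        for path in record.get("categories", []):
            if subcategory in path:
                product_ids.add(record["asin"])
                break
    return product_ids


def filter_reviews(reviews, product_ids):
    """Keep only the reviews written about one of the given products."""
    return [review for review in reviews if review["asin"] in product_ids]


# Toy records standing in for the downloaded metadata and reviews.
metadata = [
    {"asin": "A1", "categories": [["Electronics", "Headphones"]]},
    {"asin": "A2", "categories": [["Electronics", "Speakers"]]},
    {"asin": "A3", "categories": [["Electronics", "Computers & Accessories"]]},
]
reviews = [
    {"asin": "A1", "reviewText": "Great sound quality."},
    {"asin": "A2", "reviewText": "The bass is weak."},
    {"asin": "A1", "reviewText": "Comfortable ear pads."},
]

headphone_ids = products_in_subcategory(metadata, "Headphones")
headphone_reviews = filter_reviews(reviews, headphone_ids)
```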

2. Word2Vec

Downloads and builds Google's word2vec library. Copies the filtered review data from Step 1 into the current directory, builds a corpus of sentences from it, and runs word2vec to map the corpus vocabulary to vectors such that words that appear in similar contexts lie close together in the vector space. Word2vec then clusters these word embeddings to create "classes" of words (i.e., groups of words with strong semantic and syntactic relations).
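
The two ideas above, proximity in the vector space and word classes, can be sketched in a few lines. The vectors below are made-up toy values, not real embeddings. The class file format (one `word class_id` pair per line) is what the original word2vec C tool writes when run with its `-classes` option; whether this repository's scripts produce exactly that layout is an assumption.

```python
import math
from collections import defaultdict


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors; closer to 1 means more similar."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def load_word_classes(lines):
    """Group words by class ID from lines of the form 'word class_id'."""
    classes = defaultdict(list)
    for line in lines:
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        word, class_id = parts
        classes[int(class_id)].append(word)
    return dict(classes)


# Toy 2-dimensional "embeddings" (real ones have hundreds of dimensions).
toy_vectors = {
    "battery": [0.9, 0.1],
    "charger": [0.8, 0.2],
    "great": [0.1, 0.9],
}
sim_related = cosine_similarity(toy_vectors["battery"], toy_vectors["charger"])
sim_unrelated = cosine_similarity(toy_vectors["battery"], toy_vectors["great"])

# Toy class assignments in the assumed 'word class_id' format.
sample_lines = ["battery 3", "charger 3", "screen 7", "display 7", "great 12"]
word_classes = load_word_classes(sample_lines)
```

In this toy setup, `battery` and `charger` score a much higher similarity than `battery` and `great`, and they also land in the same class.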

3. Double Propagation

Downloads the Stanford CoreNLP library. Copies filtered review data from Step 2 into the current directory. Starts a CoreNLP server locally. Runs Qiu et al.'s double propagation algorithm to discover product features by bootstrapping from a seed lexicon of known opinion words (e.g., good, bad, green).
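
To give a feel for the bootstrapping loop, here is a heavily simplified sketch. It is not the repository's implementation and covers only a small subset of the rules in Qiu et al.: adjectives modifying nouns (`amod`) and conjunctions (`conj`) between two adjectives or two nouns. The dependency triples are hand-written stand-ins for CoreNLP output, and the function name and tuple layout are assumptions made for this example.

```python
def double_propagation(dependencies, seed_opinions):
    """Grow opinion-word and feature sets until no rule adds anything new.

    `dependencies` holds (relation, (head, head_pos), (dependent, dep_pos)) tuples.
    """
    opinions = set(seed_opinions)
    features = set()
    changed = True
    while changed:
        changed = False
        for relation, (head, head_pos), (dep, dep_pos) in dependencies:
            new_opinions, new_features = set(), set()
            head_is_noun = head_pos.startswith("NN")
            dep_is_noun = dep_pos.startswith("NN")
            head_is_adj = head_pos.startswith("JJ")
            dep_is_adj = dep_pos.startswith("JJ")

            if relation == "amod" and head_is_noun and dep_is_adj:
                if dep in opinions:      # known opinion word -> new feature
                    new_features.add(head)
                if head in features:     # known feature -> new opinion word
                    new_opinions.add(dep)
            elif relation == "conj" and head_is_adj and dep_is_adj:
                if head in opinions or dep in opinions:
                    new_opinions.update((head, dep))
            elif relation == "conj" and head_is_noun and dep_is_noun:
                if head in features or dep in features:
                    new_features.update((head, dep))

            if not new_opinions <= opinions or not new_features <= features:
                opinions |= new_opinions
                features |= new_features
                changed = True
    return opinions, features


# Hand-written triples, deliberately listed so that several passes are needed.
dependencies = [
    ("conj", ("display", "NN"), ("speakers", "NNS")),  # "display and speakers"
    ("amod", ("display", "NN"), ("vibrant", "JJ")),    # "vibrant display"
    ("conj", ("crisp", "JJ"), ("vibrant", "JJ")),      # "crisp and vibrant"
    ("amod", ("screen", "NN"), ("crisp", "JJ")),       # "crisp screen"
    ("amod", ("screen", "NN"), ("good", "JJ")),        # "good screen"
]

opinions, features = double_propagation(dependencies, seed_opinions={"good"})
```

Starting from the single seed word `good`, the loop first finds `screen` as a feature, then `crisp` as an opinion word, and continues alternating between the two sets until nothing new is found.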

TODOs

Compile double propagation output into tables and graphics for analysis
Build functions to measure the correlation between an opinion word's sentiment and the probability that the word occurs in a positive review versus a negative review; possibly use this information (or VADER scores) in place of the current sentiment propagation in the NN->JJ case (Intra-review Rule)
Change the example category to one with smaller files
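
One possible starting point for the second TODO is sketched below. It estimates, for each word, the fraction of reviews containing it that are positive. The function name, the whitespace tokenization, and the rule that a rating of 4 or more stars counts as positive are all assumptions made for this example.

```python
from collections import Counter


def positive_review_probability(reviews, min_positive_rating=4):
    """Estimate P(review is positive | word appears in the review).

    `reviews` is an iterable of (text, star_rating) pairs.
    """
    positive_counts = Counter()
    total_counts = Counter()
    for text, rating in reviews:
        # Count each word once per review, using naive whitespace tokenization.
        for word in set(text.lower().split()):
            total_counts[word] += 1
            if rating >= min_positive_rating:
                positive_counts[word] += 1
    return {word: positive_counts[word] / total_counts[word] for word in total_counts}


# Toy reviews standing in for the filtered review data.
toy_reviews = [
    ("good battery", 5),
    ("good screen", 4),
    ("bad battery", 1),
    ("good but bad", 2),
]
probabilities = positive_review_probability(toy_reviews)
```

These per-word estimates could then be compared against a lexicon score for the same words to check how well the two agree.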

