hautran7201/skip_gram_for_document_classification

Document classification with skip-gram (negative sampling)

Embed the words of each document so that their embeddings are as closely related as possible to the embedding of the document's label.
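The README does not show the training objective, so the following is only a minimal sketch of the standard skip-gram negative-sampling loss applied to a (word, label) pair. The function name `negative_sampling_loss` and the plain-list vectors are assumptions made for illustration and are not taken from the repository.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def dot(u, v) -> float:
    return sum(a * b for a, b in zip(u, v))


def negative_sampling_loss(word_vec, true_label_vec, negative_label_vecs) -> float:
    """Hypothetical helper: standard negative-sampling loss for one pair.

    The loss is small when the word vector aligns with the embedding of
    its document's label and points away from the sampled negative labels.
    """
    # Positive term: pull the word towards the true label embedding.
    loss = -math.log(sigmoid(dot(word_vec, true_label_vec)))
    # Negative terms: push the word away from each sampled wrong label.
    for neg_vec in negative_label_vecs:
        loss -= math.log(sigmoid(-dot(word_vec, neg_vec)))
    return loss


# Toy comparison: the same word scored against a matching and a
# mismatching label embedding.
aligned = negative_sampling_loss([1.0, 0.0], [2.0, 0.0], [[-2.0, 0.0]])
misaligned = negative_sampling_loss([1.0, 0.0], [-2.0, 0.0], [[2.0, 0.0]])
print(aligned < misaligned)
```

Minimizing this quantity over all pairs is one way to obtain word embeddings that have a high dot product with the embedding of their document's label.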

Running

Create the skip-gram dataset for training

python data/generated_data/generate_data.py
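What `generate_data.py` actually produces is not described here. As a rough sketch, assuming the dataset consists of (word, label, target) triples with randomly sampled wrong labels as negatives, the generation step might look like this. The function `make_pairs` and its parameters are hypothetical.

```python
import random


def make_pairs(documents, labels, num_negatives=2, seed=0):
    """Hypothetical helper: build (word, label, target) training triples.

    Each word is paired with its document's label (target 1) and with
    `num_negatives` randomly sampled other labels (target 0).
    """
    rng = random.Random(seed)
    label_set = sorted(set(labels))
    pairs = []
    for text, label in zip(documents, labels):
        # Candidate negatives are all labels except the document's own.
        other_labels = [lab for lab in label_set if lab != label]
        for word in text.lower().split():
            pairs.append((word, label, 1))
            if other_labels:
                for _ in range(num_negatives):
                    pairs.append((word, rng.choice(other_labels), 0))
    return pairs


# Toy corpus with two classes (illustrative data only).
docs = ["cheap pills now", "meeting at noon"]
doc_labels = ["spam", "ham"]
for triple in make_pairs(docs, doc_labels, num_negatives=1)[:4]:
    print(triple)
```

Whitespace tokenization and uniform negative sampling are simplifications. A real pipeline may use a proper tokenizer and a frequency-based sampling distribution.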

Train the skip-gram model

python train_word2vec.py

Train classifier

python classifier.py
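The design of `classifier.py` is not documented here. One simple way to turn the learned embeddings into predictions, shown purely as an assumption, is to average a document's word vectors and pick the label whose embedding has the highest dot product with that average. The function `classify` and the toy vectors below are hypothetical.

```python
def classify(text, word_vecs, label_vecs):
    """Hypothetical helper: predict the label closest to the mean word vector.

    Returns None when no word of the text has a known embedding.
    """
    words = [w for w in text.lower().split() if w in word_vecs]
    if not words:
        return None
    dim = len(word_vecs[words[0]])
    # Document vector: element-wise mean of the known word vectors.
    doc_vec = [sum(word_vecs[w][i] for w in words) / len(words) for i in range(dim)]

    def score(label):
        return sum(a * b for a, b in zip(doc_vec, label_vecs[label]))

    return max(label_vecs, key=score)


# Toy embeddings (illustrative values, not trained).
toy_word_vecs = {"cheap": [1.0, 0.0], "pills": [0.8, 0.1], "meeting": [0.0, 1.0]}
toy_label_vecs = {"spam": [1.0, 0.0], "ham": [0.0, 1.0]}
print(classify("cheap pills", toy_word_vecs, toy_label_vecs))
```

A trained classifier on top of the embeddings, as the step name suggests, would replace the fixed dot-product rule with learned parameters.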

Classification results

Train   Test    Validation
1.0     0.973   0.979

Reference

Pythonic Excursions: Optimize Computational Efficiency of Skip-Gram with Negative Sampling