Wrapper of Gensim word2vec along with T-SNE visualization

scarletcho/runWord2vec

runWord2vec

  • Scripts for training word2vec embeddings with the Gensim library and visualizing them.
  • runWord2vec wraps Gensim's word2vec and adds a t-SNE visualization of the trained model.

Requirements

  • Before running runWord2vec, make sure the Gensim Python library is installed.

  • Gensim can be installed with pip:

      $ pip install gensim
    

Data preparation

  • What to prepare:
    • A plain-text file with one sentence per line
    • Note: training good-quality word embeddings requires a sufficiently large corpus.
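
If your raw text is not yet in the one-sentence-per-line layout, a small preprocessing step can produce it. The following is a minimal sketch and not part of this repository: the function names `split_sentences` and `write_corpus` are hypothetical, and the regex splitter is deliberately naive (it breaks on `.`, `!` or `?` followed by whitespace, so abbreviations such as "Dr." will be split incorrectly).

```python
import re


def split_sentences(text):
    """Naively split raw text into sentences (hypothetical helper)."""
    # Collapse newlines and repeated spaces so each sentence ends up on one line.
    normalized = " ".join(text.split())
    # Split after sentence-ending punctuation that is followed by whitespace.
    pieces = re.split(r"(?<=[.!?])\s+", normalized)
    return [p for p in pieces if p]


def write_corpus(text, path):
    """Write one sentence per line to `path` and return the sentence count."""
    sentences = split_sentences(text)
    with open(path, "w", encoding="utf-8") as f:
        f.write("\n".join(sentences))
        if sentences:
            f.write("\n")
    return len(sentences)
```

For languages that are not whitespace-delimited, or for higher-quality splitting, a dedicated tokenizer would likely be a better choice than this regex.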

Functionality

  • What can be done:
    • Given a text file (a corpus with one sentence per line) in the same directory as the scripts, you can train your own word embeddings with Gensim using the scripts described below.

Usage

1) runWord2vec.py

  • Train and save a word2vec model with the following command:

      $ python runWord2vec.py <corpus_name> <model_name>
    
  • For example:

      $ python runWord2vec.py wiki.txt mdl_wiki
    
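For readers who want to see roughly what such a training step involves, here is a minimal sketch using Gensim's public API. It is an illustration under assumptions, not the actual contents of runWord2vec.py: the names `parse_args`, `train_and_save` and `main` are hypothetical, and the hyperparameters are left at Gensim's defaults apart from `min_count` and `workers`.

```python
def parse_args(argv):
    """Return (corpus_path, model_path) from argv shaped like the command above."""
    if len(argv) != 3:
        raise SystemExit("usage: python <script> <corpus_name> <model_name>")
    return argv[1], argv[2]


def train_and_save(corpus_path, model_path, min_count=5):
    """Train word2vec on a one-sentence-per-line file and save the model."""
    # Imported lazily so this file can be loaded without Gensim installed.
    from gensim.models import Word2Vec
    from gensim.models.word2vec import LineSentence

    # LineSentence streams the file and treats each line as one sentence
    # made of whitespace-separated tokens.
    sentences = LineSentence(corpus_path)
    # Words seen fewer than `min_count` times are dropped from the vocabulary.
    model = Word2Vec(sentences=sentences, min_count=min_count, workers=1)
    model.save(model_path)
    return model


def main(argv):
    """Hypothetical entry point mirroring the command-line usage."""
    corpus_path, model_path = parse_args(argv)
    train_and_save(corpus_path, model_path)
```

Calling `main(["train", "wiki.txt", "mdl_wiki"])` would correspond to the example command. On a very small corpus, a lower `min_count` is needed, since Gensim raises an error when no word survives the frequency cutoff.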

2) runTSNE.py

  • Visualize your trained model with the following command:

      $ python runTSNE.py <model_name>
    
  • For example:

      $ python runTSNE.py mdl_wiki
    
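The visualization step can be pictured as: load the saved model, take the most frequent words, project their vectors to 2-D with t-SNE, and draw a labeled scatter plot. The sketch below makes several assumptions that do not come from this repository: it uses scikit-learn's `TSNE` and matplotlib, it relies on the Gensim 4.x attribute `index_to_key`, and the names `clamp_perplexity`, `project_2d` and `plot_model` are hypothetical.

```python
def clamp_perplexity(n_points, requested=30.0):
    """t-SNE needs perplexity < number of points; shrink it for tiny vocabularies."""
    if n_points < 2:
        raise ValueError("need at least two vectors to project")
    return min(requested, max(1.0, (n_points - 1) / 3.0))


def project_2d(vectors, perplexity=30.0, seed=0):
    """Reduce an (n, d) array of vectors to (n, 2) coordinates with t-SNE."""
    from sklearn.manifold import TSNE  # assumption: scikit-learn is installed

    tsne = TSNE(
        n_components=2,
        perplexity=clamp_perplexity(len(vectors), perplexity),
        init="pca",
        random_state=seed,
    )
    return tsne.fit_transform(vectors)


def plot_model(model_path, out_png, top_n=200):
    """Plot the `top_n` most frequent words of a saved model to a PNG file."""
    import matplotlib
    matplotlib.use("Agg")  # render to a file, no display required
    import matplotlib.pyplot as plt
    from gensim.models import Word2Vec

    model = Word2Vec.load(model_path)
    # Assumption: Gensim 4.x, where index_to_key lists words by frequency.
    words = model.wv.index_to_key[:top_n]
    coords = project_2d(model.wv[words])

    fig, ax = plt.subplots(figsize=(10, 10))
    ax.scatter(coords[:, 0], coords[:, 1], s=5)
    for word, (x, y) in zip(words, coords):
        ax.annotate(word, (x, y), fontsize=8)
    fig.savefig(out_png, dpi=150)
    plt.close(fig)
```

With these helpers, `plot_model("mdl_wiki", "mdl_wiki.png")` would write the plot to an image file. Limiting the plot to a few hundred words keeps the labels legible and the t-SNE run short.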
