[OUTDATED] A set of classes/scripts for NLP tasks focused on finance data

hardikp/fnlp

fnlp

This repo contains scripts for training NLP models on finance text data.

Dependencies

Train new GloVe vectors

glove.py contains a GloVe model written in PyTorch. dataset.py contains a Dataset class, written so that PyTorch's torch.utils.data.DataLoader utility class can be used to batch the data during training.
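To make the Dataset idea concrete, here is a minimal sketch of a map-style dataset that yields co-occurrence triples, together with the standard GloVe weighting function. It is not the code in dataset.py or glove.py: the class name `CooccurrenceDataset`, the `window` parameter, the 1/distance weighting, and the toy sentence are all assumptions made for illustration. It uses only the standard library. A map-style dataset needs only `__len__` and `__getitem__`, which is the protocol `DataLoader` relies on.

```python
from collections import Counter


class CooccurrenceDataset:
    """Map-style dataset of (word_id, context_id, count) triples.

    Hypothetical sketch: the real dataset.py may be organised differently.
    """

    def __init__(self, tokens, window=2):
        # Assign integer ids in order of first appearance.
        self.vocab = {}
        for tok in tokens:
            self.vocab.setdefault(tok, len(self.vocab))
        ids = [self.vocab[tok] for tok in tokens]

        # Count symmetric co-occurrences, weighting each pair by 1/distance
        # (an assumption borrowed from the usual GloVe preprocessing).
        counts = Counter()
        for pos, center in enumerate(ids):
            for dist in range(1, window + 1):
                if pos + dist < len(ids):
                    other = ids[pos + dist]
                    counts[(center, other)] += 1.0 / dist
                    counts[(other, center)] += 1.0 / dist
        self.pairs = sorted(counts.items())

    def __len__(self):
        return len(self.pairs)

    def __getitem__(self, index):
        (i, j), x_ij = self.pairs[index]
        return i, j, x_ij


def glove_weight(x, x_max=100.0, alpha=0.75):
    """GloVe weighting f(x): down-weights rare pairs, caps frequent ones at 1."""
    return (x / x_max) ** alpha if x < x_max else 1.0


# Toy corpus, made up for this example.
tokens = "the stock price rose and the stock market fell".split()
dataset = CooccurrenceDataset(tokens, window=2)
```

Assuming PyTorch is installed, an object like this could in principle be wrapped as `torch.utils.data.DataLoader(dataset, batch_size=512, shuffle=True)` to obtain mini-batches of triples for the training loop.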

$ python3 glove.py --input wiki_data.txt --batch_size 512

Check the word vectors

Trained word vectors are available on the releases page.

Let's check if the closest words make sense.

$ python3 test_word_vectors.py --word IRA
roth, iras, sep, 401, contribute

$ python3 test_word_vectors.py --word option
call, options, put, exercise, underlying

$ python3 test_word_vectors.py --word stock
shares, share, market, stocks, price
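A lookup like the ones above can be sketched as a nearest-neighbour search over the vectors. The snippet below is an illustration only: it assumes cosine similarity as the ranking metric, which may not be what test_word_vectors.py uses, and the function names and three-dimensional toy vectors are invented for the example (they are not the released vectors).

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def closest_words(word, vectors, top_k=5):
    """Return the top_k words most similar to `word`, excluding the word itself."""
    query = vectors[word]
    scored = [
        (cosine(query, vec), other)
        for other, vec in vectors.items()
        if other != word
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [other for _, other in scored[:top_k]]


# Hypothetical toy vectors, chosen by hand for illustration.
toy_vectors = {
    "stock": [0.9, 0.1, 0.0],
    "shares": [0.8, 0.2, 0.1],
    "option": [0.1, 0.9, 0.1],
    "call": [0.2, 0.8, 0.0],
    "ira": [0.0, 0.1, 0.9],
}

print(", ".join(closest_words("stock", toy_vectors, top_k=2)))
```

With real embeddings the vocabulary is much larger, so a vectorized implementation (for example a single matrix-vector product over normalized vectors) would likely be preferable to this pure-Python loop.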

Notes

This implementation is CPU-only and not yet optimized. For training on CPU, it might be best to download the GloVe software from here.

Credits

License

MIT