Running preprocess.py raises ValueError: invalid literal for int() with base 10: ',' #2

Closed
zhailiang041219 opened this issue May 4, 2019 · 4 comments

Comments

@zhailiang041219

Thank you for sharing this wonderful tutorial. I downloaded the glove.42B.300d file and ran preprocess.py, but it failed with an error: ValueError: invalid literal for int() with base 10: ','. Have you ever come across this error? How can I solve it? I look forward to your reply. Thank you.

@AlexYangLi
Owner

Hi, @zhailiang041219. Did this error occur when you ran process_raw.py? I only use int() in process_raw.py, not in preprocess.py, and it works fine for me.

@zhailiang041219
Author

Thanks for your reply. I found that the glove.42B.300d file downloaded from the website cannot be used directly. I have solved the problem. Thank you.

@AlexYangLi
Owner

Hi, @zhailiang041219. I think I know what causes the error. I use gensim.models.KeyedVectors to load the GloVe embeddings, and it requires vocab_size and embedding_dim on the first line of the embedding file. However, the glove.42B.300d file doesn't contain this information, so loading it directly with gensim.models.KeyedVectors raises an error.

So I explicitly add this information to the head of the GloVe file after downloading it from the website. You can see the command in process.sh. Of course, you can load the embedding file in any other way you like :)
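For anyone who would rather do this step in Python, here is a minimal sketch of the idea. It is not the command from process.sh. The function name `add_word2vec_header` and the tiny two-line sample file are made up for illustration. It counts the vectors, reads the dimension from the first line, and writes a copy of the file with a `vocab_size embedding_dim` header in front.

```python
import os
import shutil
import tempfile


def add_word2vec_header(glove_path, output_path):
    """Copy a GloVe text file, prepending a 'vocab_size embedding_dim' line.

    Hypothetical helper for illustration; not part of gensim or this repo.
    """
    vocab_size = 0
    embedding_dim = None
    # First pass: count non-blank lines and take the dimension from the first one.
    with open(glove_path, "r", encoding="utf-8") as src:
        for line in src:
            if not line.strip():
                continue
            if embedding_dim is None:
                # One token for the word, the rest are vector components.
                embedding_dim = len(line.rstrip("\n").split(" ")) - 1
            vocab_size += 1
    if embedding_dim is None:
        raise ValueError("no vectors found in " + glove_path)

    # Second pass: write the header, then stream the original content unchanged.
    with open(glove_path, "r", encoding="utf-8") as src, \
            open(output_path, "w", encoding="utf-8") as dst:
        dst.write(f"{vocab_size} {embedding_dim}\n")
        shutil.copyfileobj(src, dst)
    return vocab_size, embedding_dim


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        raw = os.path.join(tmp, "glove.sample.txt")        # made-up sample file
        fixed = os.path.join(tmp, "glove.sample.w2v.txt")
        with open(raw, "w", encoding="utf-8") as f:
            f.write(", 0.1 0.2 0.3\nthe 0.4 0.5 0.6\n")

        # A loader that expects a header tries to parse the first token as an int.
        with open(raw, encoding="utf-8") as f:
            first_token = f.readline().split(" ")[0]
        try:
            int(first_token)
        except ValueError as err:
            print("without header:", err)

        print("header written:", add_word2vec_header(raw, fixed))
```

The sample file starts with a `,` token on purpose. If the first line of the downloaded file starts the same way, that would explain the `','` in the reported message, but that part is a guess. With gensim installed, the converted file should then load with `KeyedVectors.load_word2vec_format(path, binary=False)`. As far as I know, newer gensim releases (4.x) also accept `no_header=True` for files without a header, but check that against the version you have installed.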

@zhailiang041219
Author

zhailiang041219 commented May 5, 2019 via email

@AlexYangLi AlexYangLi pinned this issue Jul 16, 2019