Cannot reproduce your results #1
Thank you for your interest. This is a preliminary web service with the major implementation included. To reproduce the numbers in the paper, additional steps are required (the code needs slight changes, but I will add them soon).
Due to time constraints, the current version of the web service supports CPU environments only, but more features will be released in the next update.
Hi @andyweizhao, thank you for your swift reply. I understand that the code base is not using the MNLI model. However, the correlation I computed is still worse than those shown in the paper. By the way, do you apply this trick (removing subwords) in all of the studies in your paper? For example, do you also use it for WMD + BERT in Table 5?
Hi George, I forgot one additional step: TF-IDF weights are required. I will try to fix these issues this week. When combining BERT-MNLI, TF-IDF, and subword removal, you will see numbers similar to the ones below on my server (wmd-unigram). I used this trick in all tasks and for most language pairs, except "fi-en" and "lv-en".
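For anyone following along: the "removing subwords" trick isn't spelled out in this thread, so here is a minimal sketch of one plausible reading (the function name and the choice to drop WordPiece continuation pieces are my assumptions, not necessarily the repo's code) that filters out `##`-prefixed pieces and their embeddings before the distance is computed:

```python
def drop_subword_pieces(tokens, embeddings):
    """Keep only tokens that start a word; drop WordPiece
    continuation pieces (those prefixed with '##').

    `tokens` is a list of WordPiece strings and `embeddings`
    is a parallel list of vectors. Both names are illustrative.
    """
    kept_tokens, kept_vecs = [], []
    for tok, vec in zip(tokens, embeddings):
        if not tok.startswith("##"):
            kept_tokens.append(tok)
            kept_vecs.append(vec)
    return kept_tokens, kept_vecs
```

An alternative reading would be to merge the pieces back into whole words and average their vectors; without seeing the updated repo code, either interpretation is a guess.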
I just updated the repo to support reproducibility on MT. I will close this issue; please create new ones if you have additional questions.
Wow, thank you for making this happen. This is very helpful. I tried to run the code, but it seems like some of the files are missing, namely the translation data. Could you be so kind as to also upload them?
Sure thing. I just uploaded them.
Hi, thanks for the great work! I got cs-en pearson: 0.67. I'm attaching the result of running `pip freeze > requirements.txt`. Do you have any ideas on the cause of the difference? Thank you!
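As a quick sanity check when comparing reported correlations across environments, Pearson's r between metric scores and human judgments can be computed with the standard library alone (function name illustrative; `scipy.stats.pearsonr` gives the same value):

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length
    sequences of numbers (e.g. metric scores vs. human scores)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)
```

Small differences in the third decimal place can come from library versions or tokenization changes, so pinning dependencies as above (`pip freeze`) is a good first step.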
Hi Alex,
That makes sense. Thanks a lot for the clarification!
Hey there,
Thank you for putting up this repo. I quickly ran your method, the Word Mover's Distance with unigrams, on the WMT17 de-en language pair, and the Pearson correlation is only 0.645, quite a bit worse than what you report in the paper. Can you double-check the code release?
Also, it takes me 8 minutes to run on these 560 sentences. Is this expected, or am I doing something wrong?
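On the runtime question: exact Word Mover's Distance solves an optimal-transport linear program per sentence pair, which is the usual bottleneck. A common cheap check is the "relaxed WMD" lower bound, where each token simply moves all its mass to its nearest counterpart. A minimal stdlib-only sketch with uniform weights and Euclidean distance (the function name and uniform-weight choice are illustrative, not the repo's implementation):

```python
from math import dist  # Euclidean distance, Python 3.8+

def relaxed_wmd(doc_a, doc_b):
    """Lower bound on WMD: each embedding vector in doc_a sends
    all of its (uniform) mass to the closest vector in doc_b.

    doc_a, doc_b: non-empty sequences of equal-dimension vectors.
    """
    return sum(min(dist(a, b) for b in doc_b) for a in doc_a) / len(doc_a)
```

If the relaxed bound is fast but the exact LP dominates, the slowdown is in the transport solver rather than in the BERT forward passes.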