How did you generate the input data files like data.pkl, word2id and word_embedding.pkl? #29

Open
yangliuy opened this issue Mar 5, 2019 · 4 comments

Comments

@yangliuy

yangliuy commented Mar 5, 2019

First, thanks for the great ACL paper and the open-source code!

I have a question about data preprocessing. How did you generate input files such as data.pkl, word2id, vocab.txt, and word_embedding.pkl? Take UDC as an example: the raw data contains only train.txt, valid.txt, and test.txt. I checked your code and found no scripts that generate files like data.pkl and word_embedding.pkl. Could you also upload the preprocessing scripts?

@xyzhou-puck
Collaborator

Hi,

We produced those files by hacking the SMN source code, to make sure our experimental data sets are the same as theirs.

Xiangyang

@yangliuy
Author

yangliuy commented Mar 5, 2019

Hi Xiangyang,

Thank you for your reply! I found a similar question in #5. I will check the SMN preprocessing code.

@xyzhou-puck
Collaborator

You are welcome.

@MASTERPlECE

Hi! Do you know how to handle the .w2v file? How can I convert it to word_embedding.pkl?
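
For reference, one possible way to do such a conversion is sketched below. It is not the maintainers' script, and it rests on assumptions that the thread does not confirm: that the .w2v file is in the plain-text word2vec format (an optional `<vocab_size> <dim>` header line, then one `word v1 v2 ...` line per word), that word2id is available as a Python dict from word to integer id, and that word_embedding.pkl should hold a pickled matrix whose row `i` is the vector for word id `i`. The function names `load_w2v_text`, `build_embedding_matrix`, and `convert` are hypothetical.

```python
import pickle


def load_w2v_text(path):
    """Read a text-format word2vec file into a {word: vector} dict (assumed format)."""
    vectors = {}
    with open(path, encoding="utf-8") as f:
        first = f.readline().split()
        # Skip the optional "<vocab_size> <dim>" header; otherwise rewind.
        if not (len(first) == 2 and all(tok.isdigit() for tok in first)):
            f.seek(0)
        for line in f:
            parts = line.rstrip().split(" ")
            if len(parts) < 2:
                continue  # ignore blank or malformed lines
            vectors[parts[0]] = [float(x) for x in parts[1:]]
    return vectors


def build_embedding_matrix(vectors, word2id, dim):
    """Build rows indexed by word id; words without a vector stay all-zero."""
    size = max(word2id.values()) + 1
    matrix = [[0.0] * dim for _ in range(size)]
    for word, idx in word2id.items():
        if word in vectors:
            matrix[idx] = vectors[word]
    return matrix


def convert(w2v_path, word2id, out_path):
    """Convert a .w2v text file to a pickled id-indexed matrix (assumed layout)."""
    vectors = load_w2v_text(w2v_path)
    dim = len(next(iter(vectors.values())))
    matrix = build_embedding_matrix(vectors, word2id, dim)
    with open(out_path, "wb") as f:
        pickle.dump(matrix, f)
    return matrix
```

The matrix is stored as a plain list of lists to keep the sketch dependency-free; if the training code expects a NumPy array, wrapping it with `numpy.array(matrix)` before pickling would be the natural change. It is worth checking how the model code actually loads word_embedding.pkl before relying on this layout.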


3 participants