Can't load the model! #11
Error: Traceback (most recent call last):
You can try lowering the version of transformers, e.g. transformers==3.1.0.
Thank you for your fast reply. It worked. Current versions: Error:
Please don't refer me to issue #2. Thank you.
You can try regex==2018.1.10. It should work.
Thank you very much. Could you please tell me how to structure test.refs.txt? Is it just sentences separated by '\n'? Thanks in advance.
The structure is like the following:
I'm sorry, what are "references"?
Also, this error came up! Traceback (most recent call last):
In the dialogue dataset, there are many possible responses to a given context. The responses collected in advance are the references.
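To make the multi-reference idea concrete, here is a pure-Python sketch of how a generated response could be scored against every collected reference for its context, keeping the best score. The `unigram_f1` helper and the toy strings are illustrative only, not part of this repo's actual evaluation code.

```python
# Hypothetical sketch: score one generated response against several
# collected references, taking the best overlap.
def unigram_f1(hyp, ref):
    # Word-overlap F1 between a hypothesis and one reference.
    hyp_set, ref_set = set(hyp.split()), set(ref.split())
    overlap = len(hyp_set & ref_set)
    if overlap == 0:
        return 0.0
    p, r = overlap / len(hyp_set), overlap / len(ref_set)
    return 2 * p * r / (p + r)

# Several references may exist for the same dialogue context.
references = ["i like pizza", "pizza is my favorite food"]
hypothesis = "i like pizza"

# A multi-reference metric credits the hypothesis for matching ANY reference.
best = max(unigram_f1(hypothesis, ref) for ref in references)
print(best)  # 1.0
```

The same best-of-all-references idea carries over to metrics like BLEU, which natively accept multiple references per hypothesis.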
As for the error, I have never met this. Maybe you can try downgrading transformers to 3.0.0 or 2.7.0. I'm not sure.
Worked with One last question: what exactly does the "generate.py" script produce? Thanks in advance.
Follow-up question: the variable "responses" in "generate.py", line 293.
Hi Zekang, thanks for providing the source code. I just followed this post. Suppose I don't want to use the older version of Thanks in advance. Best, Dong
Hi Evram, based on my understanding, the evaluation task is to generate a response given a context. For example,
I have another question regarding the 'generation.py' script, if you are willing to answer it. Given speaker1's utterance, how do we know which utterances correspond to speaker2 and to speaker1's next response? As this is multi-turn dialogue generation, I suppose the output should contain more than one utterance. Please see the following example, where a context, ground-truth responses, and generated responses are shown. For the generated responses, the correspondence is not clear to me.
As the input indices are
Looking forward to hearing from you @Evraa, as well as @lizekang. Best, Dong
Hi, sorry for the late response. The effectiveness of
OK. Thanks a lot!
As for corresponding to different speakers, there are two ways: 1) we insert special tokens like For this question: during generation, we don't want to see the model generate special tokens inside the response.
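For illustration only, here is a toy sketch of the two speaker-tracking schemes mentioned above: special tokens inserted into the word sequence, and a parallel segment-id sequence (0 for speaker1, 1 for speaker2). The `<speaker1>`/`<speaker2>` strings and the `build_inputs` helper are hypothetical names, not the repo's actual tokens or functions.

```python
# Hypothetical speaker tokens; real models register these in the tokenizer.
SPEAKER1, SPEAKER2 = "<speaker1>", "<speaker2>"

def build_inputs(turns):
    """Build a token sequence with speaker tags plus a segment-id sequence."""
    tokens, segments = [], []
    for i, turn in enumerate(turns):
        tag = SPEAKER1 if i % 2 == 0 else SPEAKER2
        words = [tag] + turn.split()
        tokens.extend(words)
        # 0 marks speaker1's positions, 1 marks speaker2's.
        segments.extend([i % 2] * len(words))
    return tokens, segments

toks, segs = build_inputs(["hi there", "hello"])
print(toks)  # ['<speaker1>', 'hi', 'there', '<speaker2>', 'hello']
print(segs)  # [0, 0, 0, 1, 1]
```

At decoding time, the special tokens would typically be masked out of the output vocabulary so the model cannot emit them inside a response.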
Thank you for your prompt reply; let me see if I can explain my confusion clearly.
I agree on this part. During training, the input index is defined as But it might be different during generation. For example, given the first utterance from speaker1, the input index Please see the segment result from your code, where '0' corresponds to speaker1 and '1' corresponds to speaker2. Longer outputs exhibit a similar pattern (just more 1's).
My confusion: as this task is multi-turn dialogue generation, the output is essentially one round of dialogue. The model doesn't know how to further switch the speaker identity, while it knows 0 -> 1, as that information is provided by the user.
During training, ground-truth sequences are provided, while during generation, sequences are generated in an autoregressive manner.
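The training/generation difference can be sketched with a toy decoder: during generation the model conditions only on what it has produced so far, whereas during training the ground-truth sequence is fed in (teacher forcing). `toy_next_token` below is a hypothetical stand-in for the model's prediction, not code from this repo.

```python
# Toy stand-in for a model's next-token prediction: a fixed lookup table
# keyed by everything generated so far.
def toy_next_token(context):
    table = {(): "hello", ("hello",): "there", ("hello", "there"): "<eos>"}
    return table.get(tuple(context), "<eos>")

def generate(max_len=10):
    """Autoregressive decoding: each step conditions on prior output."""
    output = []
    for _ in range(max_len):
        token = toy_next_token(output)
        if token == "<eos>":
            break
        output.append(token)
    return output

print(generate())  # ['hello', 'there']
```

In training, by contrast, the loss at each position is computed with the ground-truth prefix as input, so no such loop is needed.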
Greetings,
Actually I'm surprised that such an error came up; my problem lies with this line:
model = torch.load("models/DialoFlow_large/model.bin")
model.bin is placed appropriately, and the EC2 instance runs CUDA 11.2 and PyTorch 1.9.
Where would the problem come from?
Thanks in advance