
Can't load the model! #11

Open
Evraa opened this issue Oct 19, 2021 · 19 comments

Comments

@Evraa

Evraa commented Oct 19, 2021

Greetings,

Actually, I'm surprised that such an error came up. My problem lies with this line:

model = torch.load("models/DialoFlow_large/model.bin")

model.bin is placed appropriately, and the EC2 instance runs CUDA 11.2 and PyTorch 1.9.

Where would the problem come from?

Thanks in advance

@Evraa
Author

Evraa commented Oct 19, 2021

Error:

Traceback (most recent call last):
File "generate.py", line 283, in
model = torch.load("models/DialoFlow_large/model.bin")
File "/home/ubuntu/.local/lib/python3.8/site-packages/torch/serialization.py", line 608, in load
return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
File "/home/ubuntu/.local/lib/python3.8/site-packages/torch/serialization.py", line 787, in _legacy_load
result = unpickler.load()
ModuleNotFoundError: No module named 'transformers.modeling_gpt2'

@lizekang
Collaborator

You can try lowering the version of transformers, e.g. transformers==3.1.0.
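If downgrading transformers is not possible, one hedged workaround for this particular unpickling error (assuming the checkpoint was pickled against the pre-4.x transformers layout, where GPT-2 lived in transformers.modeling_gpt2) is to alias the old module path before calling torch.load:

import sys
import torch
import transformers.models.gpt2.modeling_gpt2 as modeling_gpt2  # transformers >= 4.x layout

# The checkpoint references the old module path "transformers.modeling_gpt2",
# so register an alias for it before unpickling (a speculative workaround,
# not something from this repo).
sys.modules["transformers.modeling_gpt2"] = modeling_gpt2

model = torch.load("models/DialoFlow_large/model.bin", map_location="cpu")

This only resolves the module lookup; the checkpoint may still depend on other classes from the older library, so the version downgrade above remains the safer route.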

@Evraa
Author

Evraa commented Oct 20, 2021

Thank you for your fast reply.

It worked, but I got stuck on another problem, concerning versions I guess.

Current versions:
torch = 1.7.0
transformers = 3.1.0
pickle = 4.0
regex = 2.5.103

Error:
Traceback (most recent call last):
File "generate.py", line 283, in
model = torch.load("model.bin")
File "/home/ubuntu/.local/lib/python3.8/site-packages/torch/serialization.py", line 595, in load
return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
File "/home/ubuntu/.local/lib/python3.8/site-packages/torch/serialization.py", line 774, in _legacy_load
result = unpickler.load()
ModuleNotFoundError: No module named '_regex'

@Evraa
Author

Evraa commented Oct 20, 2021

Please don't refer me to issue #2.
It didn't work for me, and it's in Chinese, which I'm not familiar with :'D.

thank you

@lizekang
Collaborator

You can try regex==2018.1.10. It should work.

@Evraa
Author

Evraa commented Oct 20, 2021

Thank you very much.

Could you please tell me how to structure test.refs.txt? Is it just sentences separated by '\n'?
And what is the minimum/maximum number of utterances allowed?

thank you in advance

@lizekang
Collaborator

The structure is like the following:
utterance1 EOS utterance2 EOS utterance3 \t reference1 \t reference2 \t reference3 \t\n
There is no specific constraint on the minimum/maximum number of utterances.

@Evraa
Author

Evraa commented Oct 20, 2021

I'm sorry, what are "references"?
Could you provide an example?

@Evraa
Author

Evraa commented Oct 20, 2021

Also, this error came up!

Traceback (most recent call last):
File "generate.py", line 298, in
hypstr = beam_search(history, tokenizer, model, args)
File "generate.py", line 215, in beam_search
delta = work_delta(model, conv_seq, sentence_idx, token_type_seq)
File "generate.py", line 95, in work_delta
conv_hidden_state = model.speak_model(conv_seq, token_type_ids=token_type_seq)[0]
File "/home/ubuntu/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/ubuntu/.local/lib/python3.8/site-packages/transformers/modeling_gpt2.py", line 527, in forward
return_dict = return_dict if return_dict is not None else self.config.use_return_dict
File "/home/ubuntu/.local/lib/python3.8/site-packages/transformers/configuration_utils.py", line 219, in use_return_dict
return self.return_dict and not self.torchscript
AttributeError: 'GPT2Config' object has no attribute 'return_dict'

@lizekang
Collaborator

In a dialogue dataset, there can be many possible responses to the same context. The responses collected in advance are the references.
For example:
utterance1: What's your hobby?
reference1: I like basketball.
reference2: Reading. What about you?
reference3: Tell me yours first.
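For concreteness, a hypothetical line of test.refs.txt following that format (the utterances and references here are invented, not from the dataset) could be produced like this:

# One made-up line: context utterances joined by " EOS ", then one reference
# per tab, ending with a tab and a newline, as in the format described above.
line = (
    "Do you have any plans for the weekend? EOS Not yet. EOS What's your hobby?"
    "\tI like basketball."
    "\tReading. What about you?"
    "\tTell me yours first."
    "\t\n"
)
with open("test.refs.txt", "w", encoding="utf-8") as f:
    f.write(line)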

@lizekang
Collaborator

As for the error, I have never met it before. Maybe you can try downgrading transformers to 3.0.0 or 2.7.0. I'm not sure.

@Evraa
Author

Evraa commented Oct 25, 2021

It worked with transformers version 3.0.0, not with 2.7.0 or 3.1.0.
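As a quick sanity check, the installed versions can be compared against the ones reported in this thread (these pins come from the comments above, not from an official requirements file):

from importlib.metadata import version  # Python 3.8+

# Versions reported to work in this thread (an assumption, not an official
# requirements file): torch 1.7.0, transformers 3.0.0, regex 2018.1.10.
for pkg in ("torch", "transformers", "regex"):
    print(pkg, version(pkg))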

One last question: what exactly does the "generate.py" script produce?
Taking a dialogue and some reference responses, what exactly is the output hypstr[0]?

thanks in advance

@Evraa
Author

Evraa commented Oct 25, 2021

Follow-up question: the variable "responses" in "generate.py", line 293.
What is its purpose?

@dongqian0206

Hi Zekang,

Thanks for providing the source code.

I just followed this post. Suppose I don't want to use the older version of transformers due to a Python environment issue; then pre-training DialoFlow based on GPT-2 using DailyDialog is also possible, although the results would be worse than the reported ones, right? The effectiveness of DialoFlow is independent of whether it was pre-trained on the Reddit dataset.

Thanks in advance.

Best,

Dong

@dongqian0206

dongqian0206 commented Nov 18, 2021

Follow-up question: the variable "responses" in "generate.py", line 293. What is its purpose?

Hi Evraa,

Based on my understanding, the evaluation task is to generate a response given a context. For example,

context = [utterance1], response = [[utterance2], [utterance3], [utterance4], ....].

I have another question regarding the 'generate.py' script, if you are willing to answer it. Given speaker1's utterance, how do we know which utterances correspond to speaker2 and which to speaker1's next response? As this is multi-turn dialogue generation, I suppose the output should contain more than one utterance.

Please see the following example, where a context, ground-truth responses, and generated responses are shown. For the generated responses, the correspondence is not clear to me.

["We've managed to reduce our energy consumption in our factory by about 15 per cent in the last two years ."]
["That's excellent . How have you managed that ?", "Mainly because we've invested in a heat recovery system .", 'What does that mean exactly ?', 'Well , we use the exhaust gases from our printing presses to provide energy to heat our dryers .', 'What other sources of energy do you use ?', "We don't use any fossil fuels . Most of our power comes from hydro-electric plants . We're hoping to use even more energy from alternative sources in the future - perhaps even wind power ."]
["Does that mean that we can't afford to pay for more? We can't afford to pay for more than we can afford. Why not? We can't afford to pay for more. Why can't we? We can't afford to pay for more."]

As the input indices are [speaker1, text1, eos, empty, speaker2, text2, eos, empty], one potential way is to comment out the following code in the 'generate.py' script. But I am not sure whether that is correct.

# if o in [eos, empty, speaker1, speaker2]:
#     continue

Looking forward to hearing from you @Evraa, as well as @lizekang.

Best,

Dong

@lizekang
Collaborator

(Quoting @dongqian0206's question above about pre-training DialoFlow on DailyDialog without the Reddit pre-training.)

Hi, sorry for the late response. The effectiveness of DialoFlow is independent of the pre-training corpus; it should also be effective without pre-training on the Reddit dataset.

@dongqian0206

OK. Thanks a lot!

@lizekang
Collaborator

(Quoting @dongqian0206's question above about how the generated output corresponds to speaker1 and speaker2.)

To tell the speakers apart, there are two ways: 1) we insert special tokens like speaker1 and speaker2; 2) we use different segment embeddings for different speakers (see the function build_input_from_input in generate.py).

As for the commented-out code, during generation we don't want the model to generate special tokens inside the response:

# if o in [eos, empty, speaker1, speaker2]:
#     continue
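As a minimal illustration of that filter (the names hyp_ids, tokenizer and special_ids are assumptions, not the repo's actual variables), the special ids are simply dropped before detokenizing:

def strip_special_tokens(hyp_ids, tokenizer, special_ids):
    # special_ids would hold the ids of eos, empty, speaker1 and speaker2;
    # dropping them mirrors the commented-out "continue" above.
    return tokenizer.decode([o for o in hyp_ids if o not in special_ids])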

@dongqian0206

dongqian0206 commented Nov 18, 2021


Thank you for your prompt reply. Let me see if I can explain my confusion more clearly.

To tell the speakers apart, there are two ways: 1) we insert special tokens like speaker1 and speaker2; 2) we use different segment embeddings for different speakers (see the function build_input_from_input in generate.py).

I agree with this part. During training, the input index is defined as [speaker1, text1, eos, empty, speaker2, text2, eos, empty], which shows the correspondence.

But it might be different during generation. For example, given the first utterance from speaker1, the input index [speaker1, text1, eos, empty] is provided, as well as the segment index.

Please see the segment result from your code, where '0' corresponds to speaker1 and '1' corresponds to speaker2. Longer outputs exhibit a similar pattern (just more 1's).
tensor([[0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]])
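As a rough sketch (not the repo's actual code) of how such segment ids are built, every token in a turn simply shares that speaker's id:

def build_token_types(turn_lengths):
    # turn_lengths: number of tokens in each turn, speaker1's turn first.
    # Positions in speaker1's turns get 0, positions in speaker2's turns get 1.
    types = []
    for i, n in enumerate(turn_lengths):
        types.extend([i % 2] * n)
    return types

print(build_token_types([4, 6]))  # -> [0, 0, 0, 0, 1, 1, 1, 1, 1, 1]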

My confusion: although this task is multi-turn dialogue generation, the output is essentially one round of dialogue. The model doesn't know how to switch the speaker identity any further; it only knows 0 --> 1, since that information is provided by the user.

# An odd number of context utterances means speaker2 replies next;
# otherwise the next turn belongs to speaker1.
if len(conv) % 2 == 1:
    current_output = [speaker2]
else:
    current_output = [speaker1]

During training, ground-truth sequences are provided, while during generation, sequences are generated in an autoregressive manner.
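For illustration, a hedged sketch of how the parity rule above could drive multi-turn generation; generate_one_response is a hypothetical stand-in for the beam search in generate.py, not a function from the repo:

def generate_dialogue(first_utterance_ids, num_turns, speaker1, speaker2, generate_one_response):
    # generate_one_response(conv, speaker) would return the token ids of the
    # next turn given the conversation so far and the next speaker token.
    conv = [first_utterance_ids]
    for _ in range(num_turns):
        speaker = speaker2 if len(conv) % 2 == 1 else speaker1  # same parity rule as above
        conv.append(generate_one_response(conv, speaker))
    return conv

Appending each generated response to the conversation flips the parity, so the speaker token (and the segment id) alternates on every turn.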
