
Response in medium model decreases over time. #84

Open
bossauh opened this issue Mar 12, 2022 · 5 comments

Comments


bossauh commented Mar 12, 2022

I'm using a modified version of the example code provided on the Hugging Face website.

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch


tokenizer = AutoTokenizer.from_pretrained("microsoft/DialoGPT-medium")
model = AutoModelForCausalLM.from_pretrained("microsoft/DialoGPT-medium")


step = 0
while True:

    # encode the new user input, add the eos_token and return a tensor in Pytorch
    new_user_input_ids = tokenizer.encode(input("You: ") + tokenizer.eos_token, return_tensors='pt')

    # append the new user input tokens to the chat history
    bot_input_ids = torch.cat([chat_history_ids, new_user_input_ids], dim=-1) if step > 0 else new_user_input_ids

    # generate a response while limiting the total chat history to 1000 tokens
    chat_history_ids = model.generate(
        bot_input_ids, 
        pad_token_id=tokenizer.eos_token_id,
        max_length=1000,
        do_sample=True,
        top_k=50,
        top_p=0.95,
        repetition_penalty=1.35
    )

    # pretty-print the last output tokens from the bot
    print(tokenizer.decode(chat_history_ids[:, bot_input_ids.shape[-1]:][0], skip_special_tokens=True))
    step += 1

After a few exchanges, the responses start becoming shorter and shorter until the model just doesn't output anything anymore.

You: hello
Hello! :D
You: coffee tastes so good lol
It does. I'm going to have a coffee tomorrow morning, and it's gonna be delicious haha
You: what's your type of coffee?
I don't drink coffee but if you want some I'll make one for ya! It will probably taste like a cup of tea though...
You: Yes please make me one
Sure thing!
You: I prefer light coffee alright?
Yeah that sounds great!
You: to be honest i just drink coffee for the sake of tasting good lol
Haha okay then ill try my best with this one too XD
You: cool
Sounds awesome ltsss
You: do you prefer your coffee dark or light
ooooo
You: what?
yayyy
You: what are you saying?

You:

I tried changing the max_length to about 5000 and it doesn't seem to do anything. I've tried getting rid of the history (i.e., just using the new_user_input_ids variable) and that seemed to fix the issue, but it obviously leads to very random responses, since the model has no context for what we're talking about.

@archmagos-dominus

Try tweaking repetition_penalty=1.35; maybe the model just feels like it's running out of tokens to output without repeating itself.
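
For example, the generate call from the original snippet with a milder penalty (the value 1.2 here is just a guess; 1.0 disables the penalty entirely):

chat_history_ids = model.generate(
    bot_input_ids,
    pad_token_id=tokenizer.eos_token_id,
    max_length=1000,
    do_sample=True,
    top_k=50,
    top_p=0.95,
    repetition_penalty=1.2  # assumed value; try anything between 1.0 and 1.35
)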

@iamatulsingh

I'm facing the same problem. Tweaking repetition_penalty (e.g. to 0.8) won't help; you can even remove repetition_penalty entirely and try with a 1000-token history. Once the history grows beyond 1000 tokens, the issue appears. And if you use a 5000-token history, it will cross that limit at some point too if you let the bot talk long enough.


archmagos-dominus commented Mar 30, 2022

You might be correct, and this might be an issue with the length of the chat_history_ids. You can try something like this then:

# check the size of the history; if it's over 'max_length', drop the oldest tokens
# ('args' is assumed to be a dict of generation settings; see the example below)
if bot_input_ids.size(dim=1) >= args.get('max_length'):
    # trim to the most recent max_length tokens
    bot_input_ids = torch.narrow(bot_input_ids, 1, -args.get('max_length'), args.get('max_length'))
chat_history_ids = model.generate(
    bot_input_ids,
    max_length=args.get('max_length') + 20,
    pad_token_id=tokenizer.eos_token_id,
    no_repeat_ngram_size=args.get('no_repeat_ngram_size'),
    do_sample=args.get('do_sample'),
    top_k=args.get('top_k'),
    top_p=args.get('top_p'),
    temperature=args.get('temperature')
)

Try tweaking the max_length value and see what's the maximum you can go to without issues.
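
Note that args isn't defined in the snippet above; it's presumably just a plain dict of generation settings, along these lines (every value here is an illustrative placeholder, not the original author's):

# hypothetical settings dict assumed by the snippet above;
# the values are placeholders, not recommendations
args = {
    'max_length': 1000,
    'no_repeat_ngram_size': 3,
    'do_sample': True,
    'top_k': 50,
    'top_p': 0.95,
    'temperature': 0.8,
}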

As for what would cause this issue, I don't really know what to say.

@iamatulsingh

Thanks for the help, but that seems like kind of a weird way to do it, and it will consume a lot of RAM if the bot talks for a long time. There should be a cleaner way to trim the history: once it grows past, say, the last 5 turns, the oldest turns should be dropped, along the lines of the sketch below.
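
A minimal sketch of that idea, assuming the tokenizer and the bot_input_ids tensor from the original example (the helper name and the max_turns default are made up here):

import torch

def trim_to_last_turns(history_ids: torch.Tensor,
                       eos_token_id: int,
                       max_turns: int = 5) -> torch.Tensor:
    """Keep only the most recent max_turns EOS-delimited turns."""
    ids = history_ids[0].tolist()
    # every EOS token closes one turn (user or bot)
    eos_positions = [i for i, tok in enumerate(ids) if tok == eos_token_id]
    if len(eos_positions) <= max_turns:
        return history_ids  # nothing to drop yet
    # cut just after the EOS that closes the oldest surplus turn
    cut = eos_positions[-(max_turns + 1)] + 1
    return history_ids[:, cut:]

# inside the chat loop, before calling model.generate:
# bot_input_ids = trim_to_last_turns(bot_input_ids, tokenizer.eos_token_id)

Cutting on turn boundaries keeps whole utterances, so the model never sees a half-truncated sentence at the start of its context.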


csanadpoda commented Apr 16, 2022

I'm having the same issue; increasing max_length or tweaking repetition_penalty doesn't help. It rarely lasts 15 rounds before it starts exclusively returning EOS tokens. Even setting max_length to something like 50000 doesn't change this: DialoGPT still regresses to returning empty replies after roughly 10 user inputs.

Before that, it starts giving shorter and shorter answers. Any idea what could cause such behavior?

I'm using the medium-sized model with a max_length of 2000 and get conversations like the one below. What causes this eventual shortening and then disappearance of replies, and what can I do to change it? I'd like to be able to hold a conversation indefinitely.


> User: What makes you happy?
> Bot: I'm a big fan of the game, and I love my family.
> User: Who is in your family?
> Bot: My mom's husband and sister are both in it too!
> User: What about your mom?
> Bot: She's not in it yet but she will be soon enough :D
> User: How old is your sister?
> Bot: she was born on December 13th so she'll probably be around for another year or two at least...
> User: Do you like her?
> Bot: Yeah she's pretty cool. She has been playing with me since we were kids haha
> User: How old are you?
> Bot: About 7 years now?
> User: That's pretty young to be online...
> Bot: haha
> User: Aren't you afraid?
> Bot: Yayyyy
> User: Yay what?
> Bot: Hah
> User: please talk to me
> Bot: heeee
> User: tell me a story
> Bot: o
> User: what?
> Bot: u
> User: What about me?
> Bot:
> User: You're done huh
> Bot:
