
Is there a limit on the decoder output length? #11

Open
MarsMeng1994 opened this issue Nov 2, 2023 · 5 comments

Comments

@MarsMeng1994

import argparse

parser = argparse.ArgumentParser()
parser.add_argument('--base_model', default="llama-2-7b-chat-hf/", type=str)
parser.add_argument('--lora_weights', default="tloen/alpaca-lora-7b", type=str,
                    help="If None, perform inference on the base model")
# Note: argparse's type=bool treats any non-empty string as True, so parse it explicitly.
parser.add_argument('--load_8bit', default=True, type=lambda s: s.lower() == "true",
                    help='load the model in 8-bit quantization')

You are using the default legacy behaviour of the <class 'transformers.models.llama.tokenization_llama.LlamaTokenizer'>. This is expected, and simply means that the legacy (previous) behavior will be used, so nothing changes for you. If you want to use the new behaviour, set legacy=False. This should only be set if you understand what it means, and thoroughly read the reason why this was added, as explained in https://github.com/huggingface/transformers/pull/24565
Loading checkpoint shards: 100%|████████████████| 3/3 [00:15<00:00, 5.12s/it]
Question: Write me a user login/registration system with a Vue frontend, a Go backend, and a MySQL database; write out the code.
This is a friendly reminder - the current text generation call will exceed the model's predefined maximum length (2048). Depending on the model, you may observe exceptions, performance degradation, or nothing at all.

So there is a limit on the output length? 2048 feels a bit too short, though; how can I change this length?

@little51
Contributor

little51 commented Nov 3, 2023

It's related to the block_size parameter used during fine-tuning; 2048 is already long enough. If you need longer answers, you can pass in the history of previous dialogue turns.
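
For reference, block_size in this kind of Hugging Face causal-LM fine-tuning script usually controls how the training text is chunked into fixed-length blocks, which is why it bounds what the model sees at once. A minimal sketch of the common grouping pattern (the group_texts name follows the stock run_clm example and is an assumption, not necessarily this repository's code):

from itertools import chain

block_size = 2048  # context length used during fine-tuning

def group_texts(examples):
    # Concatenate all tokenized texts, then split them into fixed-size blocks.
    concatenated = {k: list(chain(*examples[k])) for k in examples.keys()}
    total_length = len(concatenated[list(examples.keys())[0]])
    # Drop the trailing remainder so every block is exactly block_size tokens.
    total_length = (total_length // block_size) * block_size
    result = {
        k: [t[i:i + block_size] for i in range(0, total_length, block_size)]
        for k, t in concatenated.items()
    }
    result["labels"] = result["input_ids"].copy()
    return result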

@MarsMeng1994
Author

MarsMeng1994 commented Nov 3, 2023

It's related to the block_size parameter used during fine-tuning; 2048 is already long enough. If you need longer answers, you can pass in the history of previous dialogue turns.

But once decoding goes past 2048 the model generates garbage, and if no end-of-sequence token is detected it keeps generating until memory blows up.
Is there a configuration that stops generation automatically when the maximum length is reached?
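
In stock transformers there is such a cap: the generate call can be bounded explicitly, so decoding stops even when no end-of-sequence token ever appears. A minimal sketch, assuming model, tokenizer, and input_ids are already set up as in the script above:

import torch

with torch.no_grad():
    output_ids = model.generate(
        input_ids,
        max_new_tokens=512,                   # hard stop after 512 generated tokens
        eos_token_id=tokenizer.eos_token_id,  # still stop early if EOS is produced
        pad_token_id=tokenizer.eos_token_id,  # silence the missing-pad-token warning
    )
# Decode only the newly generated part, not the prompt.
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))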

@MarsMeng1994
Author

MarsMeng1994 commented Nov 3, 2023

It's related to the block_size parameter used during fine-tuning; 2048 is already long enough. If you need longer answers, you can pass in the history of previous dialogue turns.

By passing in the history, do you mean feeding the previous step's unfinished output back in as input?

@little51
Contributor

little51 commented Nov 3, 2023

The output may simply have no end-of-sequence token, and there is no good way around that. Put the previous input and output into the history parameter, and use '继续' ('continue') as the prompt for the current turn. This example was fine-tuned from Llama-2-7b-chat and the results are only so-so; I later fine-tuned the original model again, and the results were somewhat better. https://github.com/git-cloner/Llama2-chinese
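
That continue-via-history pattern looks roughly like this, assuming a plain prompt-concatenation chat loop (build_prompt and the turn variables are hypothetical illustrations, not this repository's API):

def build_prompt(history, query):
    # Hypothetical helper: flatten previous (input, output) pairs plus the new
    # query into a single prompt string for the next generate call.
    parts = []
    for user_turn, model_turn in history:
        parts.append(f"User: {user_turn}\nAssistant: {model_turn}")
    parts.append(f"User: {query}\nAssistant:")
    return "\n".join(parts)

previous_question = "..."  # the original prompt
truncated_answer = "..."   # whatever the model produced before stopping

history = [(previous_question, truncated_answer)]
prompt = build_prompt(history, "继续")  # "continue": ask the model to pick up where it left off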

@MarsMeng1994
Author

MarsMeng1994 commented Nov 3, 2023

The output may simply have no end-of-sequence token, and there is no good way around that. Put the previous input and output into the history parameter, and use '继续' ('continue') as the prompt for the current turn. This example was fine-tuned from Llama-2-7b-chat and the results are only so-so; I later fine-tuned the original model again, and the results were somewhat better. https://github.com/git-cloner/Llama2-chinese

As I understand it, the history length also counts toward the 2048 limit, since it is just concatenated in front of the current input. If the previous step already exceeded the limit, the next step won't be able to generate anything either, right?
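
That matches how plain prompt concatenation behaves: history tokens and the new query share one context window, so the budget can be checked before generating. A minimal sketch, reusing the hypothetical build_prompt from above:

MAX_CONTEXT = 2048

prompt = build_prompt(history, "继续")
prompt_tokens = len(tokenizer(prompt)["input_ids"])

# Whatever the history consumes is no longer available for the answer.
budget_for_answer = MAX_CONTEXT - prompt_tokens
if budget_for_answer <= 0:
    # The previous turn already filled the window; drop the oldest turns first.
    history = history[1:]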
