Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

生成答案重复 #92

Open
loki1017 opened this issue Sep 1, 2023 · 2 comments
Open

生成答案重复 #92

loki1017 opened this issue Sep 1, 2023 · 2 comments

Comments

@loki1017
Copy link

loki1017 commented Sep 1, 2023

非常感谢您的贡献,我基于活字1.0进行了lora模型的复现工作,下面是我的复现结果:
image
image

我想请教您关于模型回答一直重复的问题(在temperature=1.0的情况下),我在进行其他模型训练的时候也经常遇到类似的问题,我想知道这个问题产生的具体原因是什么?是因为训练方式的原因,还是因为推理时参数设置的原因呢?万分感谢!!!

@cookie925
Copy link

我也有这个问题

@loki1017
Copy link
Author

loki1017 commented Sep 7, 2023

目前我也在探索重复的解决方案,有些许想法,如果有大佬知道,也请给出指正:

  1. 与数据集质量有关,模型在过拟合的情况下很容易产生重复内容,如果你的数据集数量少可以适当地扩充数据量。
  2. 一些模型在推理阶段也要参考固定的template,比如baichuan,llama
    推荐参考此项目:推理chinese-llama-2回答一直重复 hiyouga/LLaMA-Factory#473

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants