About the training of the human preference model #47

Open
dongxqm opened this issue Jul 19, 2023 · 2 comments

dongxqm commented Jul 19, 2023

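Hello, I saw in the paper that the final comparison training uses a scoring and ranking model built from a single linear layer. Does that mean reinforcement learning is not used in this step?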

@hanyullai
Contributor

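Yes, we do not currently use reinforcement learning to train our model. For now, the human preference model is used only to filter the model's responses.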
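
For readers following this thread, here is a minimal sketch of what "a linear scoring head used only for filtering" could look like. It is not the project's actual code: the feature vectors, the pairwise logistic loss, and the function names `score`, `train`, and `select_best` are all assumptions made for illustration. The preference model is trained on (preferred, rejected) pairs and then used to pick the highest-scoring candidate, with no reinforcement learning step.

```python
import math

def score(weights, features):
    """Linear scoring head: maps a response's feature vector to one scalar."""
    return sum(w * x for w, x in zip(weights, features))

def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def pairwise_loss(s_preferred, s_rejected):
    """Pairwise ranking loss: -log(sigmoid(s_preferred - s_rejected))."""
    diff = s_preferred - s_rejected
    return max(-diff, 0.0) + math.log1p(math.exp(-abs(diff)))

def train(pairs, dim, lr=0.1, epochs=200):
    """Fit the linear head with plain SGD on (preferred, rejected) pairs."""
    weights = [0.0] * dim
    for _ in range(epochs):
        for preferred, rejected in pairs:
            diff = score(weights, preferred) - score(weights, rejected)
            grad = -sigmoid(-diff)  # derivative of the loss w.r.t. diff
            for i in range(dim):
                weights[i] -= lr * grad * (preferred[i] - rejected[i])
    return weights

def select_best(weights, candidates):
    """Filtering step: keep the candidate with the highest score."""
    return max(candidates, key=lambda c: score(weights, c["features"]))["text"]

# Hypothetical toy data: each response is already encoded as a 2-d feature
# vector (in practice this would come from a language model's hidden state).
pairs = [
    ([1.0, 0.2], [0.1, 0.9]),
    ([0.8, 0.1], [0.3, 0.7]),
]
weights = train(pairs, dim=2)

candidates = [
    {"text": "answer A", "features": [0.9, 0.1]},
    {"text": "answer B", "features": [0.2, 0.8]},
]
best = select_best(weights, candidates)
```

The filtering step only ranks candidates that were already generated, so the generator's weights are never updated by the preference scores. That is the difference from an RL-based setup.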

webdxq commented Jul 24, 2023

OK, thank you for your answer.


3 participants