fp16 support issue #41

Open
XUWeijiang opened this issue Oct 9, 2023 · 1 comment

Comments

@XUWeijiang

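I only have V100 machines on hand right now, so I tried training with fp16 (bf16 is rather slow).

However, I found that with fp16 the model does not actually seem to be training.

The check on this line always evaluates to True, meaning inf/nan was found, so training cannot make progress.

I have run the same dataset with bf16 and did not hit this problem. I also tried setting --initial-loss-scale to a fairly small value, and that did not help either.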
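
To illustrate the symptom, here is a minimal sketch of how dynamic loss scaling typically gates the optimizer step in fp16 training. It assumes a generic scaler, not this repository's actual code: `DynamicLossScaler`, `has_inf_or_nan`, `backoff`, and `min_scale` are hypothetical names chosen for the example.

```python
import math


def has_inf_or_nan(grads):
    """Return True if any gradient value is inf or nan."""
    return any(math.isinf(g) or math.isnan(g) for g in grads)


class DynamicLossScaler:
    """Hypothetical minimal scaler; an assumption, not the project's implementation."""

    def __init__(self, initial_scale=2.0 ** 16, backoff=0.5, min_scale=1.0):
        self.scale = initial_scale
        self.backoff = backoff
        self.min_scale = min_scale
        self.skipped = 0

    def step(self, scaled_grads):
        """Return unscaled gradients, or None when the update must be skipped."""
        if has_inf_or_nan(scaled_grads):
            # Overflow detected: shrink the scale and skip this update entirely.
            self.scale = max(self.scale * self.backoff, self.min_scale)
            self.skipped += 1
            return None
        return [g / self.scale for g in scaled_grads]


# Simulate gradients that are non-finite no matter what the scale is.
scaler = DynamicLossScaler(initial_scale=4.0)
results = [scaler.step([float("nan"), 1.0]) for _ in range(5)]
```

In this sketch, if the check is True on every iteration, every update is skipped and the weights never change. Lowering the initial scale only helps when the overflow is caused by the scaling itself, which may be why changing --initial-loss-scale made no difference here.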

@li-yi-dong
Collaborator

Sorry, fp16 has not been validated much on our side. We will look into it soon.
