Help with a GPU memory question: using xtuner's QLoRA config I can fine-tune within 24 GB of VRAM, but when I try non-quantized LoRA by just deleting the quantization-related settings, it reports that there isn't enough memory for LoRA fine-tuning. Why is that? ChatGLM3's official LoRA fine-tuning fits in 24 GB. I then tried fine-tuning on two GPUs and it still ran out of memory — do two GPUs not pool their memory, and only speed training up? How should I run non-quantized LoRA?
The advantage of LoRA fine-tuning is that the optimizer state is very small and there are few trainable parameters. With DeepSpeed enabled, the optimizer states are sharded across GPUs, so memory usage drops, but not by much. If two-GPU LoRA still OOMs, consider QLoRA, or QLoRA + ZeRO-3.
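To see why sharding the optimizer state helps only a little for LoRA, a back-of-envelope estimate is useful. This is a rough sketch with assumed numbers (a ChatGLM3-6B-scale model, an assumed LoRA parameter count, fp16 weights, Adam with fp32 states), not measured values:

```python
# Rough VRAM estimate for LoRA fine-tuning (assumptions: ~6B frozen base
# parameters in fp16, ~15M trainable LoRA parameters; both are assumed
# figures for illustration, not measured for any specific config).
GB = 1024 ** 3

base_params = 6.2e9   # frozen base-model parameters (assumed)
lora_params = 15e6    # trainable LoRA parameters (assumed rank/targets)

weights_fp16 = base_params * 2 / GB   # frozen fp16 weights, 2 bytes/param
# Adam keeps an fp32 master copy plus two moment buffers (~12 bytes/param),
# but only for the *trainable* LoRA parameters.
optimizer_fp32 = lora_params * 12 / GB

print(f"frozen weights:        {weights_fp16:.1f} GB")
print(f"LoRA optimizer states: {optimizer_fp32:.2f} GB")
```

The frozen fp16 weights dominate (around 12 GB here), while the LoRA optimizer states are a fraction of a gigabyte. ZeRO-2 only shards the optimizer states and gradients, so splitting them across two cards barely moves the total; quantizing the frozen weights (QLoRA) or also sharding the parameters (ZeRO-3) attacks the dominant term.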
I'm not sure whether it's a problem with my command. I can also try more GPUs — is it just a matter of having enough cards for plain LoRA to fit?
Your training command is wrong — launched that way you get DDP, not DeepSpeed (for single-GPU that command would be fine). The correct command needs --deepspeed deepspeed_zero2 or --deepspeed deepspeed_zero3 appended.
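For reference, a launch command along these lines should enable DeepSpeed sharding across both cards (the config name is a placeholder — substitute your own xtuner config file):

```shell
# Multi-GPU xtuner launch with DeepSpeed ZeRO-2 enabled.
# NPROC_PER_NODE sets the number of GPUs; YOUR_CONFIG.py is a placeholder.
NPROC_PER_NODE=2 xtuner train YOUR_CONFIG.py --deepspeed deepspeed_zero2
```

Without the --deepspeed flag, the multi-process launch falls back to plain DDP, where every GPU holds a full replica of the weights and optimizer states, so adding cards speeds training up but does not reduce per-GPU memory.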