Reminder
Reproduction
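Image used: nvcr.io/nvidia/pytorch:24.01-py3
Single-GPU training works fine.
Command executed:
bash examples/lora_multi_gpu/ds_zero3.sh (bash examples/lora_multi_gpu/single_node.sh behaves the same way)
In the YAML file, only the model name was changed.
In the bash script, only the number of processes was changed.
The config file was not modified.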
Expected behavior
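After the dataset is processed, the run hangs and no tokenizer information is printed. The behavior is the same with both accelerate and deepspeed.
Corresponding nvidia-smi status: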
System Info
Others
I consulted https://github.com/hiyouga/LLaMA-Factory/issues/1683, https://github.com/hiyouga/LLaMA-Factory/issues/1651, and https://github.com/hiyouga/LLaMA-Factory/issues/1135, but the problem remains unresolved.