Issues introduced by upstream PyTorch #72

Open
gemfield opened this issue Feb 9, 2021 · 0 comments

gemfield commented Feb 9, 2021

DeepVAC divides these issues into two categories:

  • blocking issues;
  • issues that can be worked around.

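Blocking issues

  • In DDP mode, a training job cannot also enable trace or script. Resolution: wait for upstream PyTorch to add the feature;
  • Quantization-aware training (QAT) does not support graph mode, so the network has to be modified by hand, as described in https://zhuanlan.zhihu.com/p/349019936. Resolution: wait for upstream PyTorch to add the feature;
  • The quantized model produced with script_model_dir + static_quantize_dir enabled fails at runtime (trace_model_dir + static_quantize_dir appears to work). Resolution: wait for a fix from upstream PyTorch;
  • In graph-mode quantization, the emit upsample problem;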
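
For the QAT item above, here is a rough sketch of what "modifying the network by hand" usually means in PyTorch's eager-mode quantization API. It is not taken from DeepVAC or from the linked article. `TinyNet`, the input shape, and the single training step are hypothetical, and the `torch.quantization` namespace is the one used by PyTorch releases of that period (newer releases expose the same functions under `torch.ao.quantization`).

```python
import torch
import torch.nn as nn


class TinyNet(nn.Module):
    """Hypothetical toy network, used only to illustrate the manual edits."""

    def __init__(self):
        super().__init__()
        # Manual edit: stubs marking where tensors enter and leave the quantized region.
        self.quant = torch.quantization.QuantStub()
        self.conv = nn.Conv2d(3, 8, kernel_size=3, padding=1)
        self.relu = nn.ReLU()
        self.dequant = torch.quantization.DeQuantStub()

    def forward(self, x):
        x = self.quant(x)
        x = self.relu(self.conv(x))
        return self.dequant(x)


torch.manual_seed(0)
model = TinyNet().train()  # QAT preparation is done on a model in training mode

# Assumption: use whichever quantized backend this PyTorch build selects by default.
backend = torch.backends.quantized.engine
model.qconfig = torch.quantization.get_default_qat_qconfig(backend)
torch.quantization.prepare_qat(model, inplace=True)  # inserts fake-quantize modules

# One illustrative training step on random data (a real job would loop over a dataset).
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
inputs = torch.randn(4, 3, 16, 16)
loss = model(inputs).pow(2).mean()
optimizer.zero_grad()
loss.backward()
optimizer.step()

# Convert the fake-quantized model into a real int8 model for inference.
model.eval()
quantized = torch.quantization.convert(model, inplace=False)
with torch.no_grad():
    output = quantized(inputs)
```

The stubs are the part that has to be written into the model definition itself, which is why this path cannot be applied automatically to an arbitrary network the way a graph-mode pass could.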

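Issues that can be worked around

  • Static libraries are not installed into the install directory;
  • Problems with the nccl_static and kineto libraries;
  • In static builds, exported variables must not include the CUDA shared libraries;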
gemfield changed the title from "Summary of bugs introduced by upstream PyTorch" to "Issues introduced by upstream PyTorch" on Feb 18, 2021