How to use sparse.mm in float16 training pipeline #5282
Comments
Hi! Thanks for your contribution, great first issue!
------- Update -------
on top of the operations.
Isn't it "O1" for the amp level? For torch native amp, see here for a list of ops that can be autocast:
Yes, "01" is for the amp level. I have a CNN model to train, and there is one operation using a sparse tensor in the forward loop. More specifically, the model has a
The setting I mentioned of
No, I'm saying it should be "O1", not "01". PL doesn't convert ops and tensors directly; it relies on either Apex or native torch amp. As you can see in the link I posted, sparse matrix multiplication is not a supported op (by torch native amp).
When pytorch/pytorch#41069 gets implemented, Lightning will automatically support it.
Oh, I see. Please excuse me; I don't know why I kept typing "01". It is "O1" for sure.
Okay, let me know if you run into more questions.
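Until that PyTorch issue is resolved, one common workaround (not from this thread, and assuming native torch amp rather than Apex) is to locally disable autocast around the sparse multiplication and force its inputs to float32; the shapes below are made up for illustration:

```python
import torch

# Hypothetical operands: a small sparse matrix and a dense feature matrix.
indices = torch.tensor([[0, 1, 2], [2, 0, 1]])
values = torch.tensor([1.0, 2.0, 3.0])
sp = torch.sparse_coo_tensor(indices, values, (3, 3))
dense = torch.randn(3, 3)

def sparse_mm_fp32(sp, dense):
    # Leave the surrounding autocast region so this op always runs in
    # float32; torch.sparse.mm has no half-precision CUDA kernel.
    with torch.cuda.amp.autocast(enabled=False):
        return torch.sparse.mm(sp.float(), dense.float())

out = sparse_mm_fp32(sp, dense)
```

Everything outside the `autocast(enabled=False)` block still benefits from mixed precision; only this one product is pinned to float32, so you may want to cast `out` back to the surrounding dtype before continuing the forward pass.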
What is your question?
How can we assign a certain operation (e.g. torch.sparse.mm) as a float32 operation in a float16 training setting?

Details and what I have tried
I am trying to train a model using
and I need to use sparse tensor multiplication in the forward loop. I got
RuntimeError: "addmm_sparse_cuda" not implemented for 'Half'
as reported in PyTorch issue #41069. However, this error remains even after I changed the variable type to float32. I guess Apex or pytorch-lightning is still calling
sparse.mm
with the float16 setting. Is it possible to assign a certain operation in the float16 training pipeline as a float32 operation? Or is there an alternative way to use torch.sparse.mm within the float16 training process?

Reproduce
Initialize any model (e.g. the official MNIST demo), set
and add the following code in the
forward
function. I cannot afford to do
c = b.to_dense() @ a
in practice, because of the limited GPU memory.

What's your environment?
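The reporter's actual snippet was not preserved in this thread; assuming the forward pass multiplies a sparse tensor by a dense one, a minimal sketch of code that hits the reported error on a GPU might look like:

```python
import torch

# Hypothetical stand-ins for the tensors in the forward loop.
a = torch.randn(4, 4)               # dense operand
b = torch.randn(4, 4).to_sparse()   # sparse operand

# Under 16-bit training the operands arrive as half tensors, and the
# sparse CUDA kernel is missing, producing:
#   RuntimeError: "addmm_sparse_cuda" not implemented for 'Half'
if torch.cuda.is_available():
    try:
        torch.sparse.mm(b.cuda().half(), a.cuda().half())
    except RuntimeError as err:
        print(err)

# The same product in float32 works fine; the dense fallback
# b.to_dense() @ a is what the reporter cannot afford memory-wise.
c = torch.sparse.mm(b, a)
```

This only reproduces the symptom; the fix is either the workaround discussed above (disabling autocast around the op) or waiting for half-precision sparse kernels to land upstream.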