TRT tries both sparse and dense tactics and chooses the faster one. In our experiments, sparse conv kernels are faster than dense conv kernels only when C and K are large enough (> 256). Could you try increasing C and K for the 1x1 conv and check whether the sparse tactic is then chosen?
A 3x3 conv effectively increases C by 9x, so it favors the sparse kernels.
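A quick back-of-envelope sketch of why the 3x3 convs clear the threshold while the 1x1 convs do not: under im2col lowering, a conv becomes a GEMM whose reduction dimension is C * kH * kW. The helper names and the ">256 on both dims" heuristic below are assumptions for illustration, not TRT's actual tactic-selection rule:

```python
# Sketch: estimate the GEMM reduction dimension that a conv lowers to via
# im2col. Per the comment above, 2:4 sparse tactics tend to win only when
# the relevant dimensions are large enough (> 256); the exact rule used by
# TRT's tactic selection is internal, so this is a rough heuristic.

def gemm_reduction_dim(in_channels: int, kernel_h: int, kernel_w: int) -> int:
    """im2col turns a conv into a GEMM with reduction dim C * kH * kW."""
    return in_channels * kernel_h * kernel_w

def likely_sparse_tactic(in_channels: int, out_channels: int,
                         kernel_h: int, kernel_w: int,
                         threshold: int = 256) -> bool:
    """Heuristic (assumption, not TRT's documented behavior): sparse conv
    kernels tend to be picked when both the reduction dim and the output
    channel count K exceed the threshold."""
    return (gemm_reduction_dim(in_channels, kernel_h, kernel_w) > threshold
            and out_channels > threshold)

# 1x1 conv with C=64: reduction dim is just 64, so the dense tactic wins.
print(likely_sparse_tactic(64, 64, 1, 1))    # False
# 3x3 conv with C=64: reduction dim is 64 * 9 = 576; with K=512 both
# dimensions clear the threshold, so the sparse tactic can win.
print(likely_sparse_tactic(64, 512, 3, 3))   # True
```

This is consistent with the observation in the question: 1x1 convs (and transformer Linear layers with modest feature sizes) keep a small reduction dimension, while a 3x3 conv multiplies it by 9.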
I used nsys to profile ResNet-34, and I found that only the 3x3 convs use 2:4 sparsity, while the 1x1 convs do not. (I also found that the Linear layers in transformers do not use sparsity.)
Why?