
Inaccurate Mult-Adds Estimation for Transformers #226

Open

Yiming-M opened this issue Feb 13, 2023 · 3 comments
Describe the bug

For ViT, the total mult-adds returned by torchinfo.summary is much smaller than the values reported by other sources.

To Reproduce

Code snippet:

from torchinfo import summary
from torchvision.models import vit_b_16

vit = vit_b_16()                # ViT-B/16, ~86.6M parameters
input_size = (1, 3, 224, 224)   # one 224x224 RGB image
summary(vit, input_size)

Output:

...
Total params: 86,567,656
Trainable params: 86,567,656
Non-trainable params: 0
Total mult-adds (M): 173.23
===============================================================================================
Input size (MB): 0.60
Forward/backward pass size (MB): 104.09
Params size (MB): 232.27
Estimated Total Size (MB): 336.96

Expected behavior

Other resources such as MMClassification and PapersWithCode report 33.03 GFLOPs for this model. I understand that mult-adds and FLOPs are not the same metric, but for transformers, where matrix multiplication accounts for most of the computation, one mult-add is roughly two FLOPs, so the two numbers should be within about a factor of two of each other, not 33.03G versus 173.23M.
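As a sanity check, here is a rough MAC count from the standard ViT-B/16 shapes (a back-of-envelope estimate of my own, not output from any tool):

# Rough MAC count for ViT-B/16 at 224x224 (assumed standard shapes:
# 12 encoder layers, hidden dim 768, MLP dim 3072, 196 patches + 1 class token).
L, N, D, D_mlp = 12, 197, 768, 3072

qkv  = 3 * N * D * D       # Q, K, V projections
attn = 2 * N * N * D       # QK^T scores plus attention-weighted sum over V
proj = N * D * D           # attention output projection
mlp  = 2 * N * D * D_mlp   # the two MLP linear layers

total_macs = L * (qkv + attn + proj + mlp)
print(f"{total_macs / 1e9:.1f} GMACs")  # ~17.4 GMACs

Doubling that (two FLOPs per mult-add) gives roughly 35 GFLOPs, which is in the same ballpark as the ~33G those sources report, and nowhere near 173.23M.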


Environment (please complete the following information):

  • OS: macOS Ventura 13.2 (M1 Pro)
  • Python: 3.10.9
  • Package Version (torchinfo): 1.7.2

hellcer commented Feb 21, 2023

I have run into the same problem and hope the developers will look into it. Thanks a lot.


quancs commented Mar 16, 2023

I encountered a similar bug: the MACs of the MultiheadAttention module don't get counted.
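A minimal sketch that reproduces this (the expected numbers are my reading of torchinfo 1.7.x behavior, not documented output):

import torch
from torch import nn
from torchinfo import summary

mha = nn.MultiheadAttention(embed_dim=768, num_heads=12, batch_first=True)
x = torch.randn(1, 197, 768)  # a ViT-B-sized token sequence

# MultiheadAttention runs its projections and attention matmuls through
# F.multi_head_attention_forward, so no submodule hooks fire: the reported
# "Total mult-adds" stays near zero, even though the parameter count
# (~2.4M) is still reported correctly.
summary(mha, input_data=[x, x, x])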

snimu (Contributor) commented Mar 16, 2023

The problem is that torchinfo currently only traces nn.Modules, not functions. Transformer modules often run their core computation through functional shortcuts, so much of it never gets traced.

Discussion #192 proposes a tracing mechanism that would fix this issue, but it is a big change. If anyone is up to implementing it, I think that @TylerYep would be happy about it.
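To illustrate the distinction (a minimal sketch; the module names here are made up for illustration), the same 768x768 matmul is counted when it goes through nn.Linear but invisible when done with torch.matmul:

import torch
from torch import nn
from torchinfo import summary

class FunctionalMatmul(nn.Module):
    # Runs the matmul via torch.matmul: no nn.Module hook fires, so
    # torchinfo reports ~0 mult-adds despite ~116M MACs of actual work.
    def __init__(self):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(768, 768))

    def forward(self, x):
        return torch.matmul(x, self.weight)

class ModuleMatmul(nn.Module):
    # Same computation through nn.Linear: counted, because torchinfo's
    # hooks fire on submodule calls.
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(768, 768, bias=False)

    def forward(self, x):
        return self.linear(x)

x = torch.randn(1, 197, 768)
summary(FunctionalMatmul(), input_data=x)  # expect mult-adds ~0
summary(ModuleMatmul(), input_data=x)      # expect ~116M mult-adds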
