
transformations in MiniViT paper #224

Open
gudrb opened this issue Feb 22, 2024 · 3 comments


gudrb commented Feb 22, 2024

Hello, I have a question about the transformations in the MiniViT paper.

I could find the first transformation (implemented in the MiniAttention class) in the code:

attn = self.conv_l(attn)
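As far as I understand, this line transforms the per-head attention maps. A minimal sketch of that kind of transformation (the tensor shape, head count, and 1x1 kernel below are my assumptions for illustration, not the repo's exact code):

```python
import torch
import torch.nn as nn

# Sketch of an attention-map transformation: a small convolution that mixes
# information across heads of the (B, num_heads, N, N) attention tensor.
# The shapes and the 1x1 kernel are assumptions, not the actual MiniViT code.
class AttnTransform(nn.Module):
    def __init__(self, num_heads: int):
        super().__init__()
        # treat the head dimension as channels and mix heads with a 1x1 conv
        self.conv_l = nn.Conv2d(num_heads, num_heads, kernel_size=1)

    def forward(self, attn: torch.Tensor) -> torch.Tensor:
        # attn: (batch, num_heads, N, N) attention weights
        return self.conv_l(attn)

attn = torch.softmax(torch.randn(2, 4, 16, 16), dim=-1)  # dummy attention maps
out = AttnTransform(num_heads=4)(attn)
print(out.shape)  # torch.Size([2, 4, 16, 16])
```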

However, I couldn't find the second transformation in the code (it should be before or inside the MLP in the MiniBlock class):

class MiniBlock(nn.Module):

Could you please let me know where the second transformation is?

Contributor

wkcn commented Feb 23, 2024

Hi @gudrb, thanks for your attention to our work!

In Mini-DeiT, the transformation for the MLP is the relative position encoding:

out += self.rpe_v(attn)

In Mini-Swin, the transformation for the MLP is the depth-wise convolution layer:

self.local_conv_list = nn.ModuleList()
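A minimal sketch of such a depth-wise convolution applied to the tokens before the MLP (the channel count, kernel size, and token layout here are assumptions for illustration, not the exact Mini-Swin code):

```python
import torch
import torch.nn as nn

# Sketch: a depth-wise conv over the spatial token map, inserted before the MLP.
# Channel count, kernel size, and the (B, H*W, C) token layout are assumptions.
class LocalConv(nn.Module):
    def __init__(self, dim: int, kernel_size: int = 3):
        super().__init__()
        self.dwconv = nn.Conv2d(dim, dim, kernel_size,
                                padding=kernel_size // 2, groups=dim)  # depth-wise

    def forward(self, x: torch.Tensor, H: int, W: int) -> torch.Tensor:
        # x: (B, H*W, C) tokens -> (B, C, H, W) feature map -> conv -> tokens
        B, N, C = x.shape
        x = x.transpose(1, 2).reshape(B, C, H, W)
        x = self.dwconv(x)
        return x.flatten(2).transpose(1, 2)

tokens = torch.randn(2, 7 * 7, 96)          # dummy 7x7 window of 96-dim tokens
out = LocalConv(dim=96)(tokens, H=7, W=7)   # same shape as the input
print(out.shape)  # torch.Size([2, 49, 96])
```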

Author

gudrb commented Feb 23, 2024

From the MiniViT paper:

We make several modifications on DeiT: First, we remove the [class] token. The model is attached with a global average pooling layer and a fully-connected layer for image classification. We also utilize relative position encoding to introduce inductive bias to boost the model convergence [52,59]. Finally, based on our observation that transformation for FFN only brings limited performance gains in DeiT, we remove the block to speed up both training and inference.

-> Does this mean that in the Mini-DeiT model, iRPE is utilized (for the value), and the MLP transformation is removed, leaving only the attention transformation?

Contributor

wkcn commented Feb 23, 2024

From the MiniViT paper:

We make several modifications on DeiT: First, we remove the [class] token. The model is attached with a global average pooling layer and a fully-connected layer for image classification. We also utilize relative position encoding to introduce inductive bias to boost the model convergence [52,59]. Finally, based on our observation that transformation for FFN only brings limited performance gains in DeiT, we remove the block to speed up both training and inference.

-> Does this mean that in the Mini-DeiT model, iRPE is utilized (for the value), and the MLP transformation is removed, leaving only the attention transformation?

Yes. To correct my earlier statement: there is no transformation for the FFN in Mini-DeiT, and iRPE is utilized only for the key.
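For reference, a simplified sketch of key-mode relative position encoding, where the attention logits get an extra q·r term indexed by relative position (the 1-D clipping scheme, class, and names below are simplifications of my own, not the actual iRPE implementation):

```python
import torch
import torch.nn as nn

# Simplified sketch of key-mode relative position encoding: each attention
# logit gets an extra term q_i . r_{ij}, where r_{ij} is a learned embedding
# indexed by the clipped relative offset j - i. The 1-D layout and clipping
# are simplifications for illustration; the real iRPE code uses 2-D buckets.
class KeyRPEAttention(nn.Module):
    def __init__(self, dim: int, num_heads: int = 4, max_rel: int = 16):
        super().__init__()
        self.num_heads = num_heads
        self.head_dim = dim // num_heads
        self.scale = self.head_dim ** -0.5
        self.qkv = nn.Linear(dim, dim * 3)
        self.proj = nn.Linear(dim, dim)
        self.max_rel = max_rel
        # one embedding per clipped relative offset, shared across heads
        self.rel_k = nn.Embedding(2 * max_rel + 1, self.head_dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        B, N, C = x.shape
        qkv = self.qkv(x).reshape(B, N, 3, self.num_heads, self.head_dim)
        q, k, v = qkv.permute(2, 0, 3, 1, 4)             # (B, heads, N, head_dim)

        logits = (q @ k.transpose(-2, -1)) * self.scale  # content term

        # relative offsets j - i, clipped to [-max_rel, max_rel], shifted to >= 0
        idx = torch.arange(N, device=x.device)
        rel = (idx[None, :] - idx[:, None]).clamp(-self.max_rel, self.max_rel) + self.max_rel
        r = self.rel_k(rel)                              # (N, N, head_dim)

        # key-mode bias: q_i . r_{ij}, added to the attention logits
        bias = torch.einsum('bhnd,nmd->bhnm', q, r) * self.scale
        attn = (logits + bias).softmax(dim=-1)

        out = (attn @ v).transpose(1, 2).reshape(B, N, C)
        return self.proj(out)

x = torch.randn(2, 49, 64)
print(KeyRPEAttention(dim=64)(x).shape)  # torch.Size([2, 49, 64])
```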
