
Example Doubt #19683

Closed
emi-dm opened this issue May 7, 2024 · 3 comments
Labels: stat:awaiting response from contributor, type:support (User is asking for help / asking an implementation question. Stackoverflow would be better suited.)

emi-dm commented May 7, 2024

Can someone explain to me why the CLS token is not included in this example, and how I could include it for any backend? https://keras.io/examples/vision/image_classification_with_vision_transformer/


sineeli commented May 8, 2024

Hi @emi-dm,

This design is inherited from the Transformer model for text, and we use it throughout the main
paper. An initial attempt at using only image-patch embeddings, globally average-pooling (GAP)
them, followed by a linear classifier—just like ResNet's final feature map—performed very poorly.
However, we found that this is neither due to the extra token, nor to the GAP operation. Instead,
the difference in performance is fully explained by the requirement for a different learning rate.

(Taken from the ViT paper.)

ViT can be constructed either with or without the CLS token, as described in the paper. If you want to use a CLS token, create an extra token embedding with the ViT hidden dimension (d_model) and prepend it to the projected patches.

This new embedding can be implemented as a single separate Keras layer holding one weight vector, and it works with all backends.

Example

import keras

class TokenLayer(keras.layers.Layer):
    """Prepends a learnable CLS token to a sequence of patch embeddings."""

    def build(self, input_shape):
        # One learnable d_model-sized vector, shared across the batch.
        self.cls_token = self.add_weight(
            name="cls",
            shape=(1, 1, input_shape[-1]),
            initializer="zeros",
        )

    def call(self, inputs):
        # Broadcast the token over the batch dimension...
        cls_token = self.cls_token + keras.ops.zeros_like(inputs[:, 0:1])
        # ...and prepend it along the sequence axis:
        # (batch, num_patches, d_model) -> (batch, 1 + num_patches, d_model).
        return keras.ops.concatenate([cls_token, inputs], axis=1)

Thanks and hope this helps.

@sachinprasadhs sachinprasadhs added type:support User is asking for help / asking an implementation question. Stackoverflow would be better suited. stat:awaiting response from contributor labels May 8, 2024

emi-dm commented May 9, 2024

Thank you so much @sineeli!!! I couldn't dig deeply enough into the original paper, which is what caused my doubt! Really appreciated :)

@emi-dm emi-dm closed this as completed May 9, 2024