
can you explain cross attention #13

Open
yandun72 opened this issue Oct 20, 2022 · 2 comments

yandun72 commented Oct 20, 2022

Hi, I know that in TransformerEncoderLayer(C, 4, C, 0.5) the arguments C, 4, C are d_model, n_head, and dim_feedforward.

I also see that x.unsqueeze(1) gives x the shape N × 1 × C.

Because batch_first is False for the transformer, self-attention runs over the batch dimension. But I am confused by what the paper calls cross attention: I cannot find the cross attention in the pseudocode. Could you give me an interpretation of it? Also, what if x has shape (batch, seq, hidden_size)? That is the shape for an NER task. How should BatchFormer be applied in that situation? I am looking forward to your reply!
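
For reference, a minimal PyTorch sketch of the pattern described above (the sizes N and C are illustrative, not the repo's actual values):

```python
import torch
import torch.nn as nn

N, C = 8, 64  # batch size and feature dimension (illustrative)

# TransformerEncoderLayer(C, 4, C, 0.5): d_model=C, nhead=4, dim_feedforward=C, dropout=0.5.
# batch_first defaults to False, so the layer attends over the first dimension of its input.
encoder = nn.TransformerEncoderLayer(C, 4, C, 0.5)

x = torch.randn(N, C)           # per-sample features
x_in = x.unsqueeze(1)           # (N, 1, C): the batch dimension N sits in the attended slot
out = encoder(x_in).squeeze(1)  # self-attention mixes information across the N samples
print(out.shape)                # torch.Size([8, 64])
```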

yandun72 changed the title from "can you explain the shape of the common batchformer meaning" to "can you explain cross attention" on Oct 20, 2022
zhihou7 (Owner) commented Oct 21, 2022

Hi @yandun72. If the shape of x is (batch, seq, hidden_size), you can permute it to (seq, batch, hidden_size) or set batch_first=True.

Sorry that the description of cross-attention confused you. In BatchFormer we apply attention across the batch dimension, so the cross-attention is not a separate attention mechanism; it is ordinary transformer attention. We just want to emphasize the batch dimension: you can regard it as cross-batch attention.
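
For concreteness, a minimal PyTorch sketch of the input layouts discussed above (variable names and sizes are illustrative, and this is not code from this repository):

```python
import torch
import torch.nn as nn

B, S, H = 4, 16, 64        # batch, seq, hidden_size (illustrative values)
x = torch.randn(B, S, H)

# Option 1: keep the default batch_first=False and pass a (seq, batch, hidden) tensor;
# attention then runs over the seq dimension (the first dimension of the permuted tensor).
enc = nn.TransformerEncoderLayer(H, 4, H, 0.5)
out1 = enc(x.permute(1, 0, 2)).permute(1, 0, 2)   # back to (batch, seq, hidden)

# Option 2: build the layer with batch_first=True and pass (batch, seq, hidden) directly;
# attention again runs over the seq dimension.
enc_bf = nn.TransformerEncoderLayer(H, 4, H, 0.5, batch_first=True)
out2 = enc_bf(x)

# For cross-batch attention (the BatchFormer idea) the batch dimension has to sit
# in the attended slot: with batch_first=False, feeding (batch, seq, hidden) as-is
# lets attention mix the B samples at each token position.
out_cross_batch = enc(x)   # attends over dim 0, i.e. the batch dimension
```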

Regards,

yandun72 (Author) commented

Thanks for your reply! I have got it!
