
A problem in the PositionalEncoding model code #42

kir1to455 opened this issue Apr 29, 2024 · 3 comments

@kir1to455

Hi,
Thank you for developing Corigami!
I have encountered some problems when using Corigami to train on my own data.

After the encoder step, the transposed matrix is passed into the attention module.
At that point, the matrix x is a tensor of shape [batch_size, seq_length, embedding_dim], not [seq_length, batch_size, embedding_dim]!
Performing the step x = x + self.pe[:x.size(0)] will therefore add the wrong positional information.

I think the code may have made an error in the transpose after the encoder.
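For reference, here is a minimal sketch of the standard tutorial-style PositionalEncoding I am referring to (the exact Corigami code may differ slightly). Because pe is built as [max_len, 1, d_model], indexing it with x.size(0) only works for sequence-first input; with batch-first input it slices by batch size instead of sequence length:

```python
import math
import torch
import torch.nn as nn

# Sketch of the standard tutorial-style positional encoding; may differ from the Corigami code.
class PositionalEncoding(nn.Module):
    def __init__(self, d_model, dropout=0.1, max_len=5000):
        super().__init__()
        self.dropout = nn.Dropout(p=dropout)
        position = torch.arange(max_len).unsqueeze(1)
        div_term = torch.exp(torch.arange(0, d_model, 2) * (-math.log(10000.0) / d_model))
        pe = torch.zeros(max_len, 1, d_model)            # [max_len, 1, d_model]: sequence-first layout
        pe[:, 0, 0::2] = torch.sin(position * div_term)
        pe[:, 0, 1::2] = torch.cos(position * div_term)
        self.register_buffer('pe', pe)

    def forward(self, x):
        # Correct only if x is [seq_len, batch, d_model].
        # With batch-first x ([batch, seq_len, d_model]), x.size(0) is the batch size,
        # so the wrong slice of pe is added and it broadcasts across positions.
        x = x + self.pe[:x.size(0)]
        return self.dropout(x)
```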

Best wishes,
Kirito

@tanjimin
Owner

Hi @kir1to455, thank you for raising this issue!

The move_feature_forward function is actually for adjusting the difference between the CNN and the Transformer: a CNN has channels/hiddens as the second dimension, [batch, hidden, length], while NLP models (including transformers) usually have the hidden dimension at the end, [batch, length, hidden]. We need to swap the last two axes back and forth.
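As a minimal illustration of that swap (the helper name here just mirrors the idea, it is not necessarily the exact Corigami code):

```python
import torch

def move_feature_forward(x):
    # Swap the last two axes: CNN layout [batch, hidden, length]
    # <-> transformer layout [batch, length, hidden].
    return x.transpose(1, 2)

x_cnn = torch.randn(2, 256, 128)       # [batch, hidden, length] for the conv trunk
x_seq = move_feature_forward(x_cnn)    # [batch, length, hidden] for the transformer
x_back = move_feature_forward(x_seq)   # back to [batch, hidden, length]
```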

I think you are correct about the PE, which is a separate issue. One simple fix could be to change the dimension of the loaded pe and let PyTorch handle the broadcasting. You can try this:

x = x + self.pe[:x.size(1)].transpose(0, 1)
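In context, assuming the tutorial-style module where pe has shape [max_len, 1, d_model] and x is batch-first, the forward would look roughly like this:

```python
def forward(self, x):
    # x: [batch, seq_len, d_model] (batch-first)
    # self.pe[:x.size(1)] is [seq_len, 1, d_model]; transposing gives
    # [1, seq_len, d_model], which broadcasts correctly over the batch.
    x = x + self.pe[:x.size(1)].transpose(0, 1)
    return self.dropout(x)
```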

I think the decoder compensated for the pe issue, and I'll definitely fix this in the next version! Let me know if you have other questions.

Jimin

@kir1to455
Author

Hi, Jimin @tanjimin
Thank you for your reply.
Another question I have is why this decoder is used in the paper instead of a transformer decoder.
In the decoder, you use dilated convolutions to get a larger receptive field.
But why not use a transformer decoder?

Best wishes,
Kirito

@tanjimin
Owner

Hi @kir1to455 , the decoder here is actually a dilated 2D convolutional ResNet. It is very different from a typical transformer, unless you consider ViT. I named it a decoder because the model follows the encoder-decoder architecture, and this module decodes genomic features into the final Hi-C map.
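Roughly, each decoder block is a residual block of dilated 2D convolutions. This is only a sketch under that description, not the exact Corigami code; stacking blocks with growing dilation expands the receptive field over the 2D feature map quickly:

```python
import torch
import torch.nn as nn

# Illustrative dilated 2D conv residual block; not the exact Corigami decoder.
class DilatedResBlock2d(nn.Module):
    def __init__(self, channels, dilation):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=dilation, dilation=dilation)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=dilation, dilation=dilation)
        self.norm1 = nn.BatchNorm2d(channels)
        self.norm2 = nn.BatchNorm2d(channels)
        self.act = nn.ReLU()

    def forward(self, x):
        out = self.act(self.norm1(self.conv1(x)))
        out = self.norm2(self.conv2(out))
        return self.act(out + x)   # residual connection, spatial size unchanged

# Increasing the dilation at each block grows the receptive field quickly.
decoder = nn.Sequential(*[DilatedResBlock2d(64, 2 ** i) for i in range(4)])
x = torch.randn(1, 64, 256, 256)   # [batch, channels, bins, bins]
y = decoder(x)                     # same shape, larger receptive field
```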
