I'm not sure I understand the Induced Set Attention Block correctly.
So basically SAB is a transformer encoder block without positional encoding (and without dropout?). In the paper you say that SAB is "too expensive for large sets", but the set size here corresponds to the maximum sequence length in a transformer, which is usually around 512. Why not just use SAB throughout the Set Transformer? Is there any reason other than efficiency to use ISAB instead?
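To make the comparison concrete, here is a minimal sketch of SAB vs. ISAB following the definitions in the Set Transformer paper (MAB(X, Y) = LayerNorm(H + rFF(H)) with H = LayerNorm(X + Multihead(X, Y, Y))). Module names, head counts, and the number of inducing points below are illustrative, not the repo's actual implementation. SAB attends over the full set (O(n²) in the set size n), while ISAB routes attention through m learned inducing points (O(n·m)):

```python
import torch
import torch.nn as nn

class MAB(nn.Module):
    """Multihead Attention Block: H = LN(X + Attn(X, Y, Y)); out = LN(H + rFF(H))."""
    def __init__(self, dim, num_heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.ln0 = nn.LayerNorm(dim)
        self.ln1 = nn.LayerNorm(dim)
        self.ff = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))

    def forward(self, x, y):
        h = self.ln0(x + self.attn(x, y, y)[0])
        return self.ln1(h + self.ff(h))

class SAB(nn.Module):
    """Self-attention over the whole set: quadratic in the set size n."""
    def __init__(self, dim, num_heads=4):
        super().__init__()
        self.mab = MAB(dim, num_heads)

    def forward(self, x):          # x: (batch, n, dim)
        return self.mab(x, x)

class ISAB(nn.Module):
    """Attention routed through m learned inducing points: O(n * m)."""
    def __init__(self, dim, num_heads=4, num_inducing=16):
        super().__init__()
        self.inducing = nn.Parameter(torch.randn(1, num_inducing, dim))
        self.mab0 = MAB(dim, num_heads)
        self.mab1 = MAB(dim, num_heads)

    def forward(self, x):          # x: (batch, n, dim)
        b = x.size(0)
        h = self.mab0(self.inducing.expand(b, -1, -1), x)  # (batch, m, dim)
        return self.mab1(x, h)                             # (batch, n, dim)
```

With n = 512 and m = 16, the ISAB attention cost is roughly n·m versus n² for SAB, which is the efficiency argument the paper makes; my question is whether anything beyond that motivates ISAB.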