From the PaLM paper (Chowdhery et al., 2022): "We use RoPE embeddings (Su et al., 2021) rather than absolute or relative position embeddings, since RoPE embeddings have been shown to have better performance on long sequence lengths."
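For intuition, here is a minimal PyTorch sketch of the rotary scheme from Su et al. (2021): consecutive channel pairs of the queries and keys are rotated by an angle proportional to the token position, so the attention dot product ends up depending only on relative offsets. The `rope` helper below is illustrative, not any particular library's API.

```python
import torch

def rope(x: torch.Tensor, base: float = 10000.0) -> torch.Tensor:
    """Apply rotary position embeddings to x of shape (..., seq_len, head_dim)."""
    seq_len, dim = x.shape[-2], x.shape[-1]
    # One rotation frequency per channel pair, as in Su et al. (2021).
    inv_freq = base ** (-torch.arange(0, dim, 2, dtype=torch.float32) / dim)
    # Rotation angle for every (position, pair) combination: (seq_len, dim // 2).
    angles = torch.arange(seq_len, dtype=torch.float32)[:, None] * inv_freq[None, :]
    cos, sin = angles.cos(), angles.sin()
    x1, x2 = x[..., 0::2], x[..., 1::2]  # split channels into (even, odd) pairs
    # 2D rotation of each pair by its angle, then re-interleave the channels.
    rotated = torch.stack((x1 * cos - x2 * sin, x1 * sin + x2 * cos), dim=-1)
    return rotated.flatten(-2)

# Rotate queries and keys (values are left untouched) before attention.
q = rope(torch.randn(2, 8, 128, 64))  # (batch, heads, seq_len, head_dim)
k = rope(torch.randn(2, 8, 128, 64))
```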
Since you already have xformers as a soft dependency, you should be able to pull it in directly from there if it's installed (similar to flash attention).
See https://github.com/lucidrains/rotary-embedding-torch/blob/main/rotary_embedding_torch/rotary_embedding_torch.py
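For reference, wiring it in behind an optional import could look roughly like the sketch below, mirroring the flash-attention pattern; the usage follows the README of the linked rotary-embedding-torch package, and the `ROTARY_AVAILABLE` flag name is just illustrative.

```python
import torch

try:
    from rotary_embedding_torch import RotaryEmbedding
    ROTARY_AVAILABLE = True
except ImportError:
    ROTARY_AVAILABLE = False

if ROTARY_AVAILABLE:
    # dim is the number of leading head channels to rotate; rotating only a
    # subset of the head dimension (partial rotary) is common practice.
    rotary_emb = RotaryEmbedding(dim=32)
    q = torch.randn(1, 8, 1024, 64)  # (batch, heads, seq_len, head_dim)
    k = torch.randn(1, 8, 1024, 64)
    # Rotations are applied to queries and keys just before the attention
    # dot product; positions are inferred from the sequence axis.
    q = rotary_emb.rotate_queries_or_keys(q)
    k = rotary_emb.rotate_queries_or_keys(k)
```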