From the PaLM paper (Chowdhery et al., 2022): "We use RoPE embeddings (Su et al., 2021) rather than absolute or relative position embeddings, since RoPE embeddings have been shown to have better performance on long sequence lengths."
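For intuition, here is a minimal PyTorch sketch of the rotary scheme from Su et al. (2021): consecutive channel pairs of the queries and keys are rotated by an angle proportional to the token position, so the attention dot product ends up depending only on relative offsets. The `rope` helper below is illustrative, not any particular library's API.

```python
import torch

def rope(x: torch.Tensor, base: float = 10000.0) -> torch.Tensor:
    """Apply rotary position embeddings to x of shape (..., seq_len, head_dim)."""
    seq_len, dim = x.shape[-2], x.shape[-1]
    # One rotation frequency per channel pair, as in Su et al. (2021).
    inv_freq = base ** (-torch.arange(0, dim, 2, dtype=torch.float32) / dim)
    # Rotation angle for every (position, pair) combination: (seq_len, dim // 2).
    angles = torch.arange(seq_len, dtype=torch.float32)[:, None] * inv_freq[None, :]
    cos, sin = angles.cos(), angles.sin()
    x1, x2 = x[..., 0::2], x[..., 1::2]  # split channels into (even, odd) pairs
    # 2D rotation of each pair by its angle, then re-interleave the channels.
    rotated = torch.stack((x1 * cos - x2 * sin, x1 * sin + x2 * cos), dim=-1)
    return rotated.flatten(-2)

# Rotate queries and keys (values are left untouched) before attention.
q = rope(torch.randn(2, 8, 128, 64))  # (batch, heads, seq_len, head_dim)
k = rope(torch.randn(2, 8, 128, 64))
```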
Since you already have xformers as a soft dependency, you should be able to pull it in directly from there if it's installed (similar to flash attention).
See https://github.com/lucidrains/rotary-embedding-torch/blob/main/rotary_embedding_torch/rotary_embedding_torch.py
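For reference, wiring it in behind an optional import could look roughly like the sketch below, mirroring the flash-attention pattern; the usage follows the README of the linked rotary-embedding-torch package, and the `ROTARY_AVAILABLE` flag name is just illustrative.

```python
import torch

try:
    from rotary_embedding_torch import RotaryEmbedding
    ROTARY_AVAILABLE = True
except ImportError:
    ROTARY_AVAILABLE = False

if ROTARY_AVAILABLE:
    # dim is the number of leading head channels to rotate; rotating only a
    # subset of the head dimension (partial rotary) is common practice.
    rotary_emb = RotaryEmbedding(dim=32)
    q = torch.randn(1, 8, 1024, 64)  # (batch, heads, seq_len, head_dim)
    k = torch.randn(1, 8, 1024, 64)
    # Rotations are applied to queries and keys just before the attention
    # dot product; positions are inferred from the sequence axis.
    q = rotary_emb.rotate_queries_or_keys(q)
    k = rotary_emb.rotate_queries_or_keys(k)
```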