v4.38.1

@ArthurZucker released this 22 Feb 00:24
· 891 commits to main since this release

Fix eager attention in Gemma!

TLDR:

-        attn_output = attn_output.reshape(bsz, q_len, self.hidden_size)
+        attn_output = attn_output.view(bsz, q_len, -1)
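
The release note does not spell out why the change matters. One plausible reading, offered here as an assumption rather than the author's stated reasoning, is that `num_heads * head_dim` need not equal `hidden_size`, so reshaping the attention output to `hidden_size` can fail, while `view(bsz, q_len, -1)` lets PyTorch infer the last dimension. The sketch below uses small, hypothetical sizes (not real Gemma config values) to show the difference:

```python
import torch

# Hypothetical sizes, chosen only so that num_heads * head_dim != hidden_size.
bsz, q_len = 2, 5
num_heads, head_dim, hidden_size = 4, 8, 24  # 4 * 8 = 32, not 24

# Stand-in for the attention output after transpose(1, 2).contiguous():
# shape (bsz, q_len, num_heads, head_dim).
attn_output = torch.randn(bsz, q_len, num_heads, head_dim)

# Old line: forces the last dimension to hidden_size, which does not match
# the number of elements actually present, so PyTorch raises RuntimeError.
try:
    attn_output.reshape(bsz, q_len, hidden_size)
    reshape_failed = False
except RuntimeError:
    reshape_failed = True

# New line: -1 lets PyTorch infer the last dimension (num_heads * head_dim).
fixed = attn_output.view(bsz, q_len, -1)

print(reshape_failed)
print(tuple(fixed.shape))
```

Under this assumption, the output projection that follows must accept `num_heads * head_dim` input features rather than `hidden_size`; when the two happen to be equal, both lines behave the same.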