Update Llama config to use Llama block and RoPE lower precision #358
base: main
Conversation
This is good, but I want to run that separately first to see if it makes a difference. We'll have to wait a bit to get cluster time.
Do you want a separate config then, or should I just keep this PR on hold until you're ready to take it?
Let's keep this on hold for a bit, but if it gets too long, we'll merge it as a separate config.
I just ran this on Beaker, and it said this:
coming from Can you find some place to run this to avoid errors like this, even if it's a tiny batch and sequence length, just for a few batches?
That runtime error is now fixed on a local hackish setup I have. I'll try running it on Beaker briefly to see if anything else shows up.
This seems to run fine on Beaker (with a reduced model size to compensate for the limited GPUs).
Update the Llama config to use the Llama block and lower-precision RoPE, to match the behavior of the bf16-autocast Llama more closely.
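A rough sketch of the kind of config change described, assuming OLMo-style YAML settings; the key names `block_type` and `rope_full_precision` are assumptions for illustration, not taken from this PR's diff:

```yaml
# Hypothetical excerpt of the Llama training config.
# Key names here are assumed, not confirmed from this PR.
model:
  block_type: llama           # use the Llama-specific transformer block
  rope_full_precision: false  # apply RoPE in lower (bf16) precision
```

Setting RoPE to lower precision mirrors what bf16 autocast would do implicitly, which is the stated goal of matching the bf16-autocast Llama behavior.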