
Update Llama config to use Llama block and RoPE lower precision #358

Open · wants to merge 7 commits into main

Conversation

@2015aroras (Contributor)

Updating the Llama config to use the Llama block and lower-precision RoPE, to match the behavior of bf16-autocast Llama more closely.
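
For context, a minimal sketch of what such a config change might look like, assuming the OLMo YAML config exposes `block_type` and `rope_full_precision` under the `model` section (these field names and values are my assumption for illustration, not copied from this PR's diff):

```yaml
# Sketch only -- assumed field names, not taken from this PR's changed files.
model:
  block_type: llama          # use the Llama block implementation instead of the default block
  rope_full_precision: false # let RoPE run in the autocast (bf16) precision rather than forcing fp32
```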

@dirkgr (Member) commented Nov 2, 2023

This is good, but I want to run that separately first to see if it makes a difference. We'll have to wait a bit to get cluster time.

@2015aroras (Contributor, Author)

> This is good, but I want to run that separately first to see if it makes a difference. We'll have to wait a bit to get cluster time.

Do you want a separate config then, or should I just keep this PR on hold until you're ready to take it?

@dirkgr (Member) commented Nov 2, 2023

Let's keep this on hold for a bit, but if it gets too long, we'll merge it as a separate config.

@dirkgr (Member) commented Nov 2, 2023

I just ran this on Beaker, and it said this:

RuntimeError: When using the full_megatron init, every module must have a type.

coming from /home/dirkg/LLM/olmo/model.py:826.

Can you find somewhere to run this to catch errors like this, even if it's with a tiny batch and sequence length and just for a few batches?

@2015aroras (Contributor, Author)

> I just ran this on Beaker, and it said this:
>
> RuntimeError: When using the full_megatron init, every module must have a type.
>
> coming from /home/dirkg/LLM/olmo/model.py:826.
>
> Can you find somewhere to run this to catch errors like this, even if it's with a tiny batch and sequence length and just for a few batches?

That runtime error is now fixed in a hackish local setup I have. I'll try running it on Beaker briefly to see if anything else shows up.

@2015aroras (Contributor, Author) commented Nov 3, 2023

This seems to run fine on Beaker (with a reduced model size to compensate for the lack of GPUs).
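
As a rough illustration, a smoke-test run of this sort might override the model size and run length to something tiny; the keys and values below are assumptions based on a typical OLMo config layout, not the settings actually used on Beaker:

```yaml
# Hypothetical tiny smoke-test overrides -- assumed key names and values.
model:
  d_model: 256
  n_heads: 8
  n_layers: 4
  max_sequence_length: 256
global_train_batch_size: 8
max_duration: 10   # stop after a handful of batches
```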
