
Unsloth optims for Llama #1609

Merged
winglian merged 4 commits into main from unsloth-optims on May 20, 2024

Conversation

winglian (Collaborator) commented May 11, 2024

WIP to integrate Unsloth's optimizations into axolotl.

The manual autograd for the MLP, QKV, and O projections only seems to reduce VRAM by about 1%, as opposed to the reported 8%.
The cross-entropy loss patch does help significantly, but only reduced VRAM by 13%, as opposed to the reported 17%.

edit (clarification): the cross-entropy loss patch works for both full fine-tunes and LoRA. The MLP, QKV, and O patches apply only to 4-bit QLoRA with flash attention.
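
For readers unfamiliar with the technique: the cross-entropy part of these patches is built around a hand-written autograd function rather than letting autograd differentiate through log-softmax. Below is a minimal PyTorch sketch of that general idea only; it is not Unsloth's actual implementation (which uses a fused, chunked Triton kernel), and the class and variable names are illustrative.

```python
import torch
import torch.nn.functional as F


class ManualCrossEntropy(torch.autograd.Function):
    """Cross-entropy with a hand-written backward pass (illustrative only)."""

    @staticmethod
    def forward(ctx, logits, labels):
        # logits: (N, vocab_size) float; labels: (N,) long, -100 = ignore
        valid = labels != -100
        safe_labels = torch.where(valid, labels, torch.zeros_like(labels))
        logsumexp = torch.logsumexp(logits, dim=-1)
        picked = logits.gather(-1, safe_labels.unsqueeze(-1)).squeeze(-1)
        loss = ((logsumexp - picked) * valid).sum() / valid.sum().clamp(min=1)
        ctx.save_for_backward(logits, safe_labels, logsumexp, valid)
        return loss

    @staticmethod
    def backward(ctx, grad_output):
        logits, safe_labels, logsumexp, valid = ctx.saved_tensors
        # d(loss)/d(logits) = softmax(logits) - one_hot(labels), masked and
        # averaged over the number of valid (non -100) positions. The softmax
        # is recomputed here instead of being kept alive by autograd.
        grad = torch.exp(logits - logsumexp.unsqueeze(-1))
        grad.scatter_add_(
            -1, safe_labels.unsqueeze(-1), -torch.ones_like(grad[:, :1])
        )
        grad = grad * valid.unsqueeze(-1)
        grad = grad * (grad_output / valid.sum().clamp(min=1))
        return grad, None


# Sanity check against the stock implementation.
logits = torch.randn(8, 32000, requires_grad=True)
labels = torch.randint(0, 32000, (8,))
labels[0] = -100  # ignored position
loss = ManualCrossEntropy.apply(logits, labels)
loss.backward()
ref = F.cross_entropy(logits.detach(), labels, ignore_index=-100)
assert torch.allclose(loss.detach(), ref)
```

The loss and gradient here match `torch.nn.functional.cross_entropy` with `ignore_index=-100`; the memory saving in the real kernel comes largely from fusing this computation and avoiding extra vocabulary-sized intermediates between forward and backward.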

bratao commented May 11, 2024

The cross_entropy_loss optimization is applicable even in a full fine tune, right?

winglian (Collaborator, Author) replied

> The cross_entropy_loss optimization is applicable even in a full fine tune, right?

Correct!

linux-leo commented

Are these optimizations compatible with flash attention? (Complete Noob here)

winglian changed the title from "Unsloth optims" to "Unsloth optims for Llama" on May 14, 2024
bratao commented May 17, 2024

Is it possible to also patch Qwen? It was added to Unsloth and all the optimizations work for it:
unslothai/unsloth#428

winglian (Collaborator, Author) replied

> Is it possible to also patch Qwen?

Let's tackle that in a follow-up PR.
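
For what it's worth, Qwen2's decoder blocks mirror Llama's (gated MLP plus QKV/O attention), so a follow-up could in principle reuse the same patched forward functions. The snippet below is purely a hypothetical sketch of that idea, not the actual axolotl or Unsloth patching API; `patched_mlp_forward` and `patched_attn_forward` are stand-ins for whatever the Llama patch provides.

```python
# Hypothetical sketch only; not the actual axolotl/Unsloth API.
from transformers.models.qwen2 import modeling_qwen2


def apply_llama_style_patches_to_qwen2(patched_mlp_forward, patched_attn_forward):
    """Monkey-patch Qwen2 modules with Llama-compatible manual-autograd forwards."""
    modeling_qwen2.Qwen2MLP.forward = patched_mlp_forward
    # The QKV/O patch is only relevant with flash attention, so target the
    # flash-attention variant of the attention module.
    modeling_qwen2.Qwen2FlashAttention2.forward = patched_attn_forward
```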

winglian merged commit 8a1572a into main on May 20, 2024
7 checks passed
winglian deleted the unsloth-optims branch on May 20, 2024 at 13:55