Gradient Checkpointing #549

Open
fakerybakery opened this issue Apr 18, 2024 · 2 comments
Labels
type/feature An issue or pull request that introduces a new feature

Comments


fakerybakery commented Apr 18, 2024

Hi, I'm trying to fine-tune OLMo but I'm running into the error `ValueError: OLMoForCausalLM does not support gradient checkpointing`. Is support for this planned?

Thanks for releasing OLMo!

@fakerybakery fakerybakery added the type/feature An issue or pull request that introduces a new feature label Apr 18, 2024
@2015aroras
Contributor

We just released an OLMo integration in the transformers library (v4.40.0 and up), with corresponding -hf checkpoints on the Hugging Face Hub (e.g. https://huggingface.co/allenai/OLMo-1.7-7B-hf). I haven't tried gradient checkpointing there, but it may work.
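
For anyone who wants to check this themselves, here is a minimal sketch of how one might probe a model for gradient checkpointing support. It assumes the model object follows the transformers `PreTrainedModel` interface, where `gradient_checkpointing_enable()` raises a `ValueError` like the one quoted above when the model class does not support it. The helper names `try_enable_gradient_checkpointing` and `load_olmo_hf` are made up for illustration, and whether the call succeeds for the -hf checkpoints depends on the installed transformers version.

```python
def try_enable_gradient_checkpointing(model):
    """Return True if gradient checkpointing was enabled, False if the model rejects it."""
    try:
        # On transformers models this raises
        # ValueError("<ClassName> does not support gradient checkpointing.")
        # when the model class has not opted in.
        model.gradient_checkpointing_enable()
    except ValueError:
        return False
    return True


def load_olmo_hf(name="allenai/OLMo-1.7-7B-hf"):
    """Hypothetical loader for the -hf checkpoint mentioned above."""
    # Imported lazily so the probe above can be used without transformers installed.
    # Assumes transformers >= 4.40.0, the first release with the OLMo integration.
    from transformers import AutoModelForCausalLM

    return AutoModelForCausalLM.from_pretrained(name)
```

Calling `try_enable_gradient_checkpointing(load_olmo_hf())` downloads the full 7B checkpoint, so it is worth probing on a machine that has the disk space and memory for it.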


bdytx5 commented May 20, 2024

I confirmed it does not work. This would be a great addition.
