
Gradient checkpointing issue when running QLoRA finetuning #413

Open · tytung2020 opened this issue Jul 1, 2023 · 1 comment
Labels: question (Further information is requested)

tytung2020 commented Jul 1, 2023

Finetuning mpt-7b and mpt-30b with QLoRA fails with the error "ValueError: MPTForCausalLM does not support gradient checkpointing.". Is there a way to fix this?
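
For reference, a minimal setup that reproduces this; the 4-bit settings below are illustrative, not necessarily the exact failing script:

```python
# Minimal sketch of a QLoRA setup that reproduces the error; the 4-bit
# settings are illustrative, not necessarily the exact failing script.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import prepare_model_for_kbit_training

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "mosaicml/mpt-7b",
    quantization_config=bnb_config,
    trust_remote_code=True,  # MPT ships its modeling code inside the repo
    device_map="auto",
)

# prepare_model_for_kbit_training() calls gradient_checkpointing_enable()
# by default; transformers raises the ValueError because the MPT remote
# code leaves supports_gradient_checkpointing at its False default.
model = prepare_model_for_kbit_training(model)
```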

tytung2020 (Author) commented Jul 12, 2023

Are these lines of code what is needed to make it work? cekal's amendment seems to work on the 7b version:
https://huggingface.co/cekal/mpt-7b-peft-compatible/commit/a5eab52c1c61c1d50a4e01428949f6ff90c73c48
I'm not sure it works fully as intended, though. Could someone at MosaicML check this? If so, please also implement it for the 30b version. Thanks!
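
For anyone hitting this in the meantime, the linked commit follows the standard transformers opt-in pattern for gradient checkpointing. Below is a hedged sketch of that opt-in half, applied at runtime rather than by editing the repo's modeling_mpt.py; the class names (MPTForCausalLM, MPTModel) come from the MPT remote code. Note this alone only clears the ValueError: the linked commit also wraps the block calls in torch.utils.checkpoint.checkpoint inside MPTModel.forward so activations are actually recomputed.

```python
# Sketch of the opt-in half of the fix, applied at runtime instead of
# editing modeling_mpt.py. This only clears the ValueError; the linked
# commit additionally wraps each block call in
# torch.utils.checkpoint.checkpoint inside MPTModel.forward.
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "mosaicml/mpt-7b", trust_remote_code=True
)

MPTForCausalLM = type(model)  # resolved from the repo's remote code
MPTForCausalLM.supports_gradient_checkpointing = True

def _set_gradient_checkpointing(self, module, value=False):
    # transformers applies this to every submodule when
    # gradient_checkpointing_enable()/disable() is invoked.
    if module.__class__.__name__ == "MPTModel":  # the inner decoder stack
        module.gradient_checkpointing = value

MPTForCausalLM._set_gradient_checkpointing = _set_gradient_checkpointing

model.gradient_checkpointing_enable()  # no longer raises
```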
