You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
I have to add some LoRA layers by hand(without left) to a pre-trained Multi-modal model, to finetune the model for new data. I want Deepspeed to optimize ONLY the parameters from the LoRA layer rather than all the parameters. Like this
The platform is hugging face's transformers and Deepspeed.
Therefore I decorate the Trainer from HF's transformers, as below:
Unfortunately, it doesn't work, both LoRA and non-LoRa's weights are not changed during training. It seems that the optimizer in Deepspeed is not the same as that from Pytorch.
My question is, are there any ways that allow me to ONLY finetune certain subnet (LoRA) parameters with Deepspeed+Transformer's Trainer?
The text was updated successfully, but these errors were encountered:
I have to add some LoRA layers by hand(without left) to a pre-trained Multi-modal model, to finetune the model for new data. I want Deepspeed to optimize ONLY the parameters from the LoRA layer rather than all the parameters. Like this
The platform is hugging face's transformers and Deepspeed.
Therefore I decorate the Trainer from HF's transformers, as below:
Unfortunately, it doesn't work, both LoRA and non-LoRa's weights are not changed during training. It seems that the optimizer in Deepspeed is not the same as that from Pytorch.
My question is, are there any ways that allow me to ONLY finetune certain subnet (LoRA) parameters with Deepspeed+Transformer's Trainer?
The text was updated successfully, but these errors were encountered: