How to finetune certain portion of the whole parameter #5487

Open
JasonLeeFdu opened this issue Apr 30, 2024 · 0 comments

I have added some LoRA layers by hand (without the PEFT library) to a pre-trained multi-modal model in order to finetune it on new data. I want DeepSpeed to optimize ONLY the parameters of the LoRA layers rather than all of the parameters, like this:
[image: code selecting only the LoRA parameters for optimization]

The platform is Hugging Face's Transformers plus DeepSpeed.

Therefore I customized the Trainer from HF's Transformers, as below:
[image: custom Trainer overriding the optimizer so it only receives the LoRA parameters]
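
For reference, a minimal sketch of what such a customization might look like (an assumption, not the code from the screenshot): a Trainer subclass that overrides create_optimizer so that only parameters whose names contain "lora" are handed to the optimizer. The class name LoraOnlyTrainer and the "lora" name filter are placeholders for however the hand-written layers are actually named.

```python
import torch
from transformers import Trainer


class LoraOnlyTrainer(Trainer):
    """Hypothetical reconstruction: build the optimizer over LoRA parameters only."""

    def create_optimizer(self):
        if self.optimizer is None:
            # "lora" is an assumed substring of the hand-added layers' parameter names.
            lora_params = [p for n, p in self.model.named_parameters() if "lora" in n]
            self.optimizer = torch.optim.AdamW(lora_params, lr=self.args.learning_rate)
        return self.optimizer
```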

Unfortunately, it doesn't work: neither the LoRA weights nor the non-LoRA weights change during training. It seems that the optimizer DeepSpeed builds is not the same as the one from PyTorch.

My question is: is there any way to finetune ONLY a certain sub-network (the LoRA layers) with DeepSpeed + Transformers' Trainer?
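
One possible explanation and workaround (hedged, not confirmed in this thread): if the DeepSpeed config file contains its own `optimizer` section, DeepSpeed builds the optimizer itself and a `create_optimizer` override on the Trainer is bypassed. In both code paths, parameters with `requires_grad=False` are excluded from training, so the usual way to train only the LoRA layers is to freeze everything else before constructing the Trainer. A minimal sketch, assuming the LoRA parameters can be identified by a `"lora"` substring in their names and that `ds_config.json` is the path to the DeepSpeed config:

```python
from transformers import Trainer, TrainingArguments


def mark_only_lora_as_trainable(model, keyword: str = "lora"):
    """Freeze every parameter whose name does not contain the keyword."""
    for name, param in model.named_parameters():
        param.requires_grad = keyword in name


# model = ...        # pre-trained multi-modal model with hand-added LoRA layers
# train_ds = ...     # training dataset
# mark_only_lora_as_trainable(model)
#
# training_args = TrainingArguments(
#     output_dir="lora-finetune",
#     deepspeed="ds_config.json",   # hypothetical path to the DeepSpeed config
#     per_device_train_batch_size=4,
#     learning_rate=1e-4,
# )
# trainer = Trainer(model=model, args=training_args, train_dataset=train_ds)
# trainer.train()
```

Freezing via `requires_grad` works whether the optimizer comes from the Trainer or from the DeepSpeed config, which is why it is generally preferred over overriding `create_optimizer`.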
