
[REQUEST] How to finetune ONLY a certain subset of the network parameters #5486

Open · Labels: enhancement (New feature or request)
JasonLeeFdu opened this issue Apr 30, 2024 · 0 comments

Comments


JasonLeeFdu commented Apr 30, 2024

I have to add some LoRA layers by hand (without PEFT) to a pre-trained multi-modal model in order to finetune it on new data. I want DeepSpeed to optimize ONLY the parameters of the LoRA layers, rather than all of the parameters, like this:
[screenshot: the hand-written LoRA layers added to the pre-trained model]
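In text form, what I am doing is roughly the following (a simplified sketch, not the exact code in the screenshot; the class name, rank, and scaling are placeholders):

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Wrap a frozen pre-trained Linear layer with trainable low-rank adapters."""
    def __init__(self, base: nn.Linear, r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        # Freeze the original pre-trained weights; only the LoRA matrices should train.
        self.base.weight.requires_grad_(False)
        if self.base.bias is not None:
            self.base.bias.requires_grad_(False)
        self.lora_A = nn.Parameter(torch.zeros(r, base.in_features))
        self.lora_B = nn.Parameter(torch.zeros(base.out_features, r))
        nn.init.normal_(self.lora_A, std=0.02)
        self.scaling = alpha / r

    def forward(self, x):
        # Original output plus the low-rank update.
        return self.base(x) + (x @ self.lora_A.T @ self.lora_B.T) * self.scaling
```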

The platform is Hugging Face's transformers together with DeepSpeed.

Therefore I decorate the Trainer from HF's transformers, as shown below:
[screenshot: the customized Trainer that builds an optimizer over the LoRA parameters only]
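Roughly, the customized Trainer does something like this (again a sketch, not the exact screenshot; `LoRAOnlyTrainer` is a made-up name and the `"lora_"` filter is just my own naming convention from the wrapper above):

```python
import torch
from transformers import Trainer

class LoRAOnlyTrainer(Trainer):
    def create_optimizer(self):
        # Build the optimizer over the LoRA parameters only, instead of all
        # model parameters.
        if self.optimizer is None:
            lora_params = [p for n, p in self.model.named_parameters() if "lora_" in n]
            self.optimizer = torch.optim.AdamW(lora_params, lr=self.args.learning_rate)
        return self.optimizer
```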

Unfortunately, it doesn't work: neither the LoRA weights nor the non-LoRA weights change during training. It seems that the optimizer DeepSpeed actually uses is not the one I construct in PyTorch.

My question is: is there any way to finetune ONLY a certain subset of the parameters (the LoRA layers) with DeepSpeed + Transformers' Trainer?
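To make the question concrete, the kind of setup I am hoping for is the one below, where everything except the LoRA parameters is frozen before training (an untested sketch; `model`, `training_args`, and `train_dataset` are assumed to already exist, and the `"lora_"` name match is my own convention):

```python
from transformers import Trainer

# Freeze all parameters except the LoRA ones, then let the Trainer (and
# DeepSpeed) build the optimizer over the remaining trainable parameters.
for name, param in model.named_parameters():
    param.requires_grad = "lora_" in name

trainer = Trainer(model=model, args=training_args, train_dataset=train_dataset)
trainer.train()
```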

@JasonLeeFdu added the enhancement (New feature or request) label on Apr 30, 2024