Add LoRA only options #5726

Open
wants to merge 2 commits into base: master

Conversation

Stanwang1210
Contributor

What?

I add an option lora_only that enables lora.mark_only_lora_as_trainable, which freezes all parameters except the LoRA layers. The default value of lora_only is True.
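
For illustration, a minimal sketch of how such a lora_only flag could gate the freezing step via loralib's mark_only_lora_as_trainable. The function name and signature below are assumptions, not the exact implementation in espnet2/layers/create_adapter_fn.py:

```python
import loralib as lora
import torch.nn as nn


def create_lora_adapter(model: nn.Module, rank: int = 8, lora_only: bool = True) -> nn.Module:
    """Illustrative sketch: attach LoRA adapters, then optionally freeze
    everything except the LoRA parameters when ``lora_only`` is True."""
    # ... adapter insertion (e.g. swapping target nn.Linear layers for
    # lora.Linear with r=rank) would happen here ...

    if lora_only:
        # loralib helper: sets requires_grad=False for all non-LoRA parameters.
        lora.mark_only_lora_as_trainable(model, bias="none")
    return model
```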

Why?

As suggested by @ftshijt, it is better for us to provide this option.


codecov bot commented Apr 1, 2024

Codecov Report

Attention: Patch coverage is 25.00000% with 3 lines in your changes missing coverage. Please review.

Project coverage is 39.89%. Comparing base (3858d84) to head (c584334).

Files                                 Patch %   Lines
espnet2/layers/create_adapter_fn.py   25.00%    3 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##           master    #5726       +/-   ##
===========================================
- Coverage   71.58%   39.89%   -31.69%     
===========================================
  Files         760      760               
  Lines       69841    69845        +4     
===========================================
- Hits        49996    27865    -22131     
- Misses      19845    41980    +22135     
Flag                        Coverage Δ
test_integration_espnet1    62.92% <ø> (ø)
test_integration_espnetez   27.98% <25.00%> (-0.01%) ⬇️
test_python_espnet1         18.20% <0.00%> (-0.01%) ⬇️
test_python_espnet2         ?
test_python_espnetez        13.95% <25.00%> (+<0.01%) ⬆️
test_utils                  20.91% <ø> (ø)

Flags with carried forward coverage won't be shown.


@ftshijt
Collaborator

ftshijt commented Apr 1, 2024

@simpleoier Hi, could you please check whether your concern about the default behavior is addressed by this PR?

@simpleoier
Collaborator

Sorry, I may not fully understand the purpose of this PR.
This PR only adds a warning for the s3prl frontend. How about replacing it with assert checks, so that training stops if LoRA is applied to the s3prl frontend?
I'm afraid not everyone reads the logging output carefully.

Also, my original concern was that LoRA could not be used for HuBERT / WavLM / Wav2vec2 models in the current codebase.
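
A minimal sketch of the assert-style check suggested above; the guard function is hypothetical and the check on the frontend class name is an assumption, not the actual PR code:

```python
def assert_lora_compatible(model) -> None:
    # Hypothetical guard: abort training instead of only logging a warning
    # when LoRA is requested for a model that uses the s3prl frontend.
    frontend = getattr(model, "frontend", None)
    assert frontend is None or type(frontend).__name__ != "S3prlFrontend", (
        "LoRA is not supported for the s3prl frontend in the current codebase."
    )
```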

@Stanwang1210
Contributor Author

This PR is only to support the option of enabling lora.mark_only_lora_as_trainable. It is related to the espnetez dependency.
For issue #5721, we will create another PR to fix it.

@simpleoier
Collaborator

This PR is only to support the option of enabling lora.mark_only_lora_as_trainable

OK, now I understand the point. Thanks for the explanation.
If I understand it correctly, if lora_only is set to False (the behavior before this PR), the other parameters would be optimized together with the LoRA layers, right? That sounds a little odd to me, since LoRA is meant to be a PEFT method.
My suggestion for adding other trainable parameters is that, if some parameters need to be optimized during training, it would be better to add a separate process for other_trainable_parameters. It is more flexible. What do you think?
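
A rough sketch of this other_trainable_parameters idea; the option name and helper below are hypothetical, not an existing ESPnet API:

```python
import torch.nn as nn


def unfreeze_other_trainable_parameters(model: nn.Module, patterns: list) -> None:
    """Hypothetical helper: after freezing non-LoRA weights, re-enable
    gradients for parameters whose names match any requested pattern."""
    for name, param in model.named_parameters():
        if any(pattern in name for pattern in patterns):
            param.requires_grad = True
```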

@simpleoier Hi, could you please check whether your concern about the default behavior is addressed by this PR?

@ftshijt I was trying to recall that. Maybe my previous concern was that the existing configs (possibly for pretrained models) set the argument use_lora: true. Now it is replaced by new variables, namely

use_adapter: true
adapter: lora
adapter_conf: ...

It would be better to update those configs if there are any.

@Stanwang1210
Contributor Author

Stanwang1210 commented Apr 4, 2024

@simpleoier
First, your understanding of lora_only is correct.
However, I think it is better to use the freeze_param option to control which groups of parameters should be frozen. Simply freezing all other parameters when employing PEFT methods may unexpectedly freeze downstream models as well.
Therefore, I prefer a pipeline in which create_adapter_fn only adds adapter modules without modifying the requires_grad of any parameters.

As for other_trainable_parameters, it largely overlaps with the freeze_param option, since they are essentially opposites of each other: the parameters not listed in freeze_param are exactly those that would be listed in other_trainable_parameters. Therefore, I think this design is a bit redundant.
What do you think about this?
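
A sketch of the proposed separation: create_adapter_fn only inserts adapter modules, and freezing is driven entirely by the freeze_param option. The helper below is an assumption for illustration, not ESPnet's actual trainer code:

```python
import torch.nn as nn


def apply_freeze_param(model: nn.Module, freeze_param: list) -> None:
    """Illustrative sketch: freeze only the parameter groups whose names
    start with one of the prefixes listed in ``freeze_param``."""
    for name, param in model.named_parameters():
        if any(name.startswith(prefix) for prefix in freeze_param):
            param.requires_grad = False
```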

@simpleoier
Collaborator

Hi @Stanwang1210, using freeze_param is fine.

@Stanwang1210
Contributor Author

So, are there any remaining concerns about this PR?
