Add LoRA only options #5726

Open
wants to merge 2 commits into base: master

Conversation

Stanwang1210
Contributor

What?

I add an option lora_only that enables lora.mark_only_lora_as_trainable, which freezes all parameters except the LoRA layers. The default value of lora_only is True.
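
For illustration, a minimal sketch of how such a lora_only flag could gate the freezing step via loralib's mark_only_lora_as_trainable. The function name and signature below are assumptions, not the exact implementation in espnet2/layers/create_adapter_fn.py:

```python
import loralib as lora
import torch.nn as nn


def create_lora_adapter(model: nn.Module, rank: int = 8, lora_only: bool = True) -> nn.Module:
    """Illustrative sketch: attach LoRA adapters, then optionally freeze
    everything except the LoRA parameters when ``lora_only`` is True."""
    # ... adapter insertion (e.g. swapping target nn.Linear layers for
    # lora.Linear with r=rank) would happen here ...

    if lora_only:
        # loralib helper: sets requires_grad=False for all non-LoRA parameters.
        lora.mark_only_lora_as_trainable(model, bias="none")
    return model
```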

Why?

As suggested by @ftshijt, it is better for us to provide this option.


codecov bot commented Apr 1, 2024

Codecov Report

Attention: Patch coverage is 25.00000% with 3 lines in your changes missing coverage. Please review.

Project coverage is 39.89%. Comparing base (3858d84) to head (c584334).

Files                                 Patch %   Lines
espnet2/layers/create_adapter_fn.py   25.00%    3 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##           master    #5726       +/-   ##
===========================================
- Coverage   71.58%   39.89%   -31.69%     
===========================================
  Files         760      760               
  Lines       69841    69845        +4     
===========================================
- Hits        49996    27865    -22131     
- Misses      19845    41980    +22135     
Flag                        Coverage Δ
test_integration_espnet1    62.92% <ø> (ø)
test_integration_espnetez   27.98% <25.00%> (-0.01%) ⬇️
test_python_espnet1         18.20% <0.00%> (-0.01%) ⬇️
test_python_espnet2         ?
test_python_espnetez        13.95% <25.00%> (+<0.01%) ⬆️
test_utils                  20.91% <ø> (ø)

Flags with carried forward coverage won't be shown.


@ftshijt
Collaborator

ftshijt commented Apr 1, 2024

@simpleoier Hi, could you please check whether your concern about the default behavior is addressed by this PR?

@simpleoier
Collaborator

Sorry, I may not fully understand the purpose of this PR.
This PR only adds a warning for the s3prl frontend. How about replacing it with assert checks, so that training stops if LoRA is applied to the s3prl frontend?
I'm afraid not everyone reads the logging output carefully.

Also, my original concern was that LoRA could not be used for HuBERT / WavLM / Wav2vec2 models in the current codebase.
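
A minimal sketch of the assert-style check suggested above; the guard function is hypothetical and the check on the frontend class name is an assumption, not the actual PR code:

```python
def assert_lora_compatible(model) -> None:
    # Hypothetical guard: abort training instead of only logging a warning
    # when LoRA is requested for a model that uses the s3prl frontend.
    frontend = getattr(model, "frontend", None)
    assert frontend is None or type(frontend).__name__ != "S3prlFrontend", (
        "LoRA is not supported for the s3prl frontend in the current codebase."
    )
```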

@Stanwang1210
Contributor Author

This PR is only to support the option of enabling lora.mark_only_lora_as_trainable. It is related to the espnetez dependency.
For issue #5721, we will create another PR to fix it.

@simpleoier
Collaborator

This PR is only to support the option of enabling lora.mark_only_lora_as_trainable

OK, now I understand the point. Thanks for the explanation.
If I understand it correctly, if lora_only is set to False (the behavior before this PR), the other parameters would be optimized together with the LoRA layers, right? That sounds a little odd to me, since LoRA is meant to be a PEFT method.
My suggestion for adding other trainable parameters is that, if some parameters need to be optimized during training, it would be better to add a separate process for other_trainable_parameters. It is more flexible. What do you think?
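
A rough sketch of this other_trainable_parameters idea; the option name and helper below are hypothetical, not an existing ESPnet API:

```python
import torch.nn as nn


def unfreeze_other_trainable_parameters(model: nn.Module, patterns: list) -> None:
    """Hypothetical helper: after freezing non-LoRA weights, re-enable
    gradients for parameters whose names match any requested pattern."""
    for name, param in model.named_parameters():
        if any(pattern in name for pattern in patterns):
            param.requires_grad = True
```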

@simpleoier Hi, could you please check whether your concern about the default behavior is addressed by this PR?

@ftshijt I was trying to recall that. Maybe my previous concern was that the existing configs (possibly for pretrained models) set the argument use_lora: true. Now it is replaced by new variables, namely

use_adapter: true
adapter: lora
adapter_conf: ...

It would be better to update those configs if there are any.

@Stanwang1210
Contributor Author

Stanwang1210 commented Apr 4, 2024

@simpleoier
First, your understanding of lora_only is correct.
However, I think it is better to use the freeze_param option to control which groups of parameters should be frozen. Simply freezing all other parameters when employing PEFT methods may unexpectedly freeze downstream models as well.
Therefore, I prefer a pipeline in which create_adapter_fn only adds adapter modules without modifying the requires_grad of any parameters.

As for other_trainable_parameters, it largely overlaps with the freeze_param option, since they are essentially opposites of each other: the parameters not listed in freeze_param are exactly those that would be listed in other_trainable_parameters. Therefore, I think this design is a bit redundant.
What do you think about this?
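
A sketch of the proposed separation: create_adapter_fn only inserts adapter modules, and freezing is driven entirely by the freeze_param option. The helper below is an assumption for illustration, not ESPnet's actual trainer code:

```python
import torch.nn as nn


def apply_freeze_param(model: nn.Module, freeze_param: list) -> None:
    """Illustrative sketch: freeze only the parameter groups whose names
    start with one of the prefixes listed in ``freeze_param``."""
    for name, param in model.named_parameters():
        if any(name.startswith(prefix) for prefix in freeze_param):
            param.requires_grad = False
```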

@simpleoier
Collaborator

Hi @Stanwang1210, using freeze_param is fine.

@Stanwang1210
Contributor Author

So, are there any remaining concerns about this PR?
