Describe the bug
When applying LoRA fine-tuning to the s3prl frontend (e.g. hubert_base), the output has no gradients. More specifically, I simply used the last layer instead of the multi-layer weighted sum.
Basic environments:
OS information: Linux 4.18.0-372.32.1.el8_6.x86_64 #1 SMP Fri Oct 28 15:56:52 EDT 2022 x86_64
Traceback (most recent call last):
File "/miniconda/envs/espnet/lib/python3.9/runpy.py", line 197, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/miniconda/envs/espnet/lib/python3.9/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/espnet/espnet2/bin/asr_train.py", line 23, in <module>
main()
File "/espnet/espnet2/bin/asr_train.py", line 19, in main
ASRTask.main(cmd=cmd)
File "/espnet/espnet2/tasks/abs_task.py", line 1132, in main
cls.main_worker(args)
File "/espnet/espnet2/tasks/abs_task.py", line 1447, in main_worker
cls.trainer.run(
File "/espnet/espnet2/train/trainer.py", line 317, in run
all_steps_are_invalid = cls.train_one_epoch(
File "/espnet/espnet2/train/trainer.py", line 684, in train_one_epoch
scaler.scale(loss).backward()
File "/miniconda/envs/espnet/lib/python3.9/site-packages/torch/_tensor.py", line 492, in backward
torch.autograd.backward(
File "/miniconda/envs/espnet/lib/python3.9/site-packages/torch/autograd/__init__.py", line 251, in backward
Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn
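For context, this RuntimeError is PyTorch's generic failure mode when backward() is called on a tensor that is not connected to any parameter requiring gradients. A minimal standalone reproduction (unrelated to ESPnet itself):

```python
import torch

# A tensor created without requires_grad has no autograd graph attached.
x = torch.randn(3)
loss = x.sum()  # loss.grad_fn is None, loss.requires_grad is False

try:
    loss.backward()
except RuntimeError as e:
    # "element 0 of tensors does not require grad and does not have a grad_fn"
    print(e)
```

In this issue, the loss ends up in exactly this state because none of the trainable (LoRA) parameters participate in the forward computation.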
Sorry for the late reply.
This bug occurs due to the attention implementation in the s3prl frontend. Specifically, it uses torch.nn.functional.multi_head_attention_forward, which only uses the q_proj.weight.
So for now, please avoid using LoRA with the s3prl frontend. An alternative to LoRA could be the Houlsby adapter.
We will send a PR to prevent users from using LoRA with the s3prl frontend soon.
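To illustrate the failure mode with a hypothetical minimal LoRA wrapper (not ESPnet's or loralib's actual implementation): torch.nn.functional.multi_head_attention_forward takes raw weight tensors, so a LoRA module wrapped around a projection layer never has its forward() called. Its trainable low-rank parameters never enter the autograd graph, and with the base weights frozen the attention output has no grad_fn at all:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LoRALinear(nn.Module):
    """Minimal LoRA sketch: frozen base weight plus trainable low-rank update."""
    def __init__(self, base: nn.Linear, r: int = 4):
        super().__init__()
        self.base = base
        self.base.weight.requires_grad_(False)  # freeze the pretrained weight
        self.lora_A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(base.out_features, r))

    def forward(self, x):
        # The low-rank branch only contributes when forward() actually runs.
        return self.base(x) + x @ self.lora_A.T @ self.lora_B.T

embed_dim, num_heads = 8, 2
q_proj = LoRALinear(nn.Linear(embed_dim, embed_dim))
# Frozen stand-ins for the k/v/output projections, for brevity.
k_w = torch.eye(embed_dim)
v_w = torch.eye(embed_dim)
out_w = torch.eye(embed_dim)

x = torch.randn(5, 1, embed_dim)  # (seq_len, batch, embed_dim)

# The functional attention path reads the raw .weight tensor directly;
# LoRALinear.forward is bypassed, so lora_A/lora_B never enter the graph.
attn_out, _ = F.multi_head_attention_forward(
    x, x, x, embed_dim, num_heads,
    in_proj_weight=None, in_proj_bias=None,
    bias_k=None, bias_v=None, add_zero_attn=False,
    dropout_p=0.0, out_proj_weight=out_w, out_proj_bias=None,
    use_separate_proj_weight=True,
    q_proj_weight=q_proj.base.weight, k_proj_weight=k_w, v_proj_weight=v_w,
)
print(attn_out.requires_grad)   # False: only frozen weights were used

# Calling the module directly would include the trainable LoRA branch:
print(q_proj(x).requires_grad)  # True
```

With every tensor on the functional path frozen, loss.backward() then raises exactly the "element 0 of tensors does not require grad" error in the log above.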
Thank you for reporting this bug.
Basic environments:
python version: 3.9.18 (main, Sep 11 2023, 13:41:44) [GCC 11.2.0]
espnet version: espnet 202402
pytorch version: pytorch 2.1.0
Git hash: ec9760b22654dc04eeecd37e2659ebda0325a786
Commit date: Sat Mar 2 01:25:58 2024 -0500
Environments from torch.utils.collect_env:
Task information:
To Reproduce
Steps to reproduce the behavior:
cd egs2/librispeech_100/asr1
Error logs