decoding in Hubert: RecursionError: maximum recursion depth exceeded in comparison #5480

Charlottehoo opened this issue Apr 16, 2024

❓ Questions and Help

I am currently working with the Hubert model.
I pretrained a new model on a YouTube dataset and fine-tuned it on some animal sound files. The label for each file is an animal name, so I created a new dictionary together with .ltr and .wrd label files; the .ltr and .wrd files are identical and contain the animal-name labels (a rough sketch of how I prepared them is below).
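For reference, this is roughly how I prepared the label files; the directory, label values, and dictionary filename below are placeholders, not my exact setup:

```python
# Rough sketch (hypothetical paths and labels) of preparing the fine-tuning
# label files: one animal-name label per audio file, with .ltr kept identical
# to .wrd.
from collections import Counter
from pathlib import Path

label_dir = Path("finetune_trans")        # assumed label directory (task.label_dir)
labels = ["dog", "cat", "bird", "dog"]    # one label per utterance, in manifest order

label_dir.mkdir(parents=True, exist_ok=True)
(label_dir / "train.wrd").write_text("\n".join(labels) + "\n")
(label_dir / "train.ltr").write_text("\n".join(labels) + "\n")  # identical to .wrd

# fairseq-style dictionary: one "<token> <count>" entry per line
counts = Counter(labels)
with open(label_dir / "dict.ltr.txt", "w") as f:
    for tok, cnt in counts.most_common():
        f.write(f"{tok} {cnt}\n")
```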

I ran 50 iterations of pretraining and 50 iterations of fine-tuning,

but when I decode the test dataset using checkpoint_best.pt, I run into the following error.

What is your question?

File "/export/home/chu/fairseq/fairseq/models/hubert/hubert_asr.py", line 374, in init model = pretrain_task.build_model(w2v_args.model, from_checkpoint=True)
File "/export/home/chu/fairseq/fairseq/tasks/fairseq_task.py", line 355, in build_model model = models.build_model(cfg, self, from_checkpoint)
File "/export/home/chu/fairseq/fairseq/models/init.py", line 106, in build_model return model.build_model(cfg, task)
File "/export/home/chu/fairseq/fairseq/models/hubert/hubert_asr.py", line 170, in build_model w2v_encoder = HubertEncoder(cfg, task)
File "/export/home/chu/fairseq/fairseq/models/hubert/hubert_asr.py", line 349, in init state = checkpoint_utils.load_checkpoint_to_cpu(cfg.w2v_path, arg_overrides)
File "/export/home/chu/fairseq/fairseq/checkpoint_utils.py", line 358, in load_checkpoint_to_cpu state["cfg"] = OmegaConf.create(state["cfg"])
RecursionError: maximum recursion depth exceeded in comparison full_key: job_logging_cfg.formatters.simple.format
reference_type=Any object_type=dict
full_key: job_logging_cfg.formatters.simple reference_type=Any
object_type=dict full_key: job_logging_cfg.formatters
reference_type=Any object_type=dict
full_key: job_logging_cfg reference_type=Optional[Dict[Union[str, Enum], Any]]
object_type=dict

Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.
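For context, here is a minimal sketch (checkpoint path left as a placeholder) of how I can reproduce the failing OmegaConf conversion outside of Hydra and inspect the job_logging_cfg entry named in the error; it is only a diagnostic, not part of my training setup:

```python
# Minimal diagnostic sketch (placeholder path): reproduce the OmegaConf.create()
# call that fails inside checkpoint_utils.load_checkpoint_to_cpu and look at the
# job_logging_cfg sub-tree the error points to.
import sys

import torch
from omegaconf import OmegaConf

ckpt_path = "/.../checkpoints/checkpoint_best.pt"  # placeholder checkpoint path
state = torch.load(ckpt_path, map_location="cpu")

cfg = state["cfg"]
print(type(cfg))
print(list(cfg.keys()))        # job_logging_cfg should show up here
print(cfg["job_logging_cfg"])  # the sub-tree the error complains about

# Raise the recursion limit and retry the conversion to see whether the stored
# config is just very deep or genuinely self-referential.
sys.setrecursionlimit(10000)
try:
    OmegaConf.create(cfg)
    print("OmegaConf.create succeeded with a higher recursion limit")
except RecursionError as exc:
    print("still recursing:", exc)
```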

Code

python fairseq_cli/hydra_train.py \
  --config-dir ~/fairseq/examples/hubert/config/finetune \
  --config-name base_10h \
  task.data=/.../data/finetune_data \
  task.label_dir=/.../trans/finetune_trans \
  model.w2v_path=/.../fairseq/None/checkpoints/checkpoint_best.pt \
  dataset.skip_invalid_size_inputs_valid_test=true \
  checkpoint.save_interval=1 \
  checkpoint.reset_optimizer=true

python examples/speech_recognition/new/infer.py \
  --config-dir ~/.../config/decode \
  --config-name infer_viterbi \
  task.data=/.../data \
  task.normalize=false \
  dataset.gen_subset=test \
  common_eval.path=/.../checkpoints/checkpoint_best.pt

Config file

defaults:
  - model: null

hydra:
  run:
    dir: ${common_eval.results_path}/viterbi
  sweep:
    dir: ${common_eval.results_path}
    subdir: viterbi

task:
  _name: hubert_pretraining
  single_target: true
  fine_tuning: true
  data: ???
  normalize: ???

decoding:
  type: viterbi
  unique_wer_file: true
common_eval:
  results_path: ???
  path: ???
  post_process: letter
dataset:
  max_tokens: 1100000
  gen_subset: ???
What's your environment?

  • fairseq Version: 0.12.2
  • PyTorch Version: 2.2.2
  • OS: Linux
  • How you installed fairseq: pip
  • Build command you used: pip install --no-build-isolation --editable ./
  • Python version: 3.9.19