
Multires hubert #5363

Merged: 10 commits merged into facebookresearch:main on Feb 26, 2024

Conversation

@ftshijt (Contributor) commented on Oct 31, 2023

Before submitting

  • Was this discussed/approved via a GitHub issue? (not needed for typos or doc improvements)
  • Did you read the contributor guideline?
  • Did you make sure to update the docs?
  • Did you write any new necessary tests?

What does this PR do?

Adds a new multi-resolution HuBERT implementation to fairseq, described in https://arxiv.org/abs/2310.02720.
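
As a rough sketch of downstream use, assuming fairseq's generic checkpoint utilities, a placeholder checkpoint path (pre-trained models are not uploaded yet; see the TODOs below), and that the model keeps a HuBERT-style extract_features entry point:

import torch
from fairseq import checkpoint_utils

# Placeholder path; no official MR-HuBERT checkpoint accompanies this PR yet.
ckpt_path = "/path/to/mr_hubert_checkpoint.pt"
models, saved_cfg, task = checkpoint_utils.load_model_ensemble_and_task([ckpt_path])
model = models[0].eval()

# 16 kHz mono waveform, batch of 1, shaped (batch, samples) as fairseq speech models expect.
wav = torch.randn(1, 16000)
with torch.no_grad():
    # Assumption: the multi-resolution model exposes the usual HuBERT-style API.
    features, _ = model.extract_features(source=wav, padding_mask=None)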

PR review

Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in GitHub issues, there's a high chance it will not be merged.

TODOs (in progress)

  • Add training configs/preprocessing scripts
  • Add documentation
  • Upload pre-trained models

untie_final_proj: true
activation_dropout: 0.0
conv_adapator_kernal: 1
use_single_target: true

Hi @ftshijt, first of all, thank you for this interesting research and the pre-training scripts. If I understand correctly, the paper (Appendix B.4) shows that predicting a single target is slightly worse than multi-task prediction, so why is use_single_target set to true in all the configs?

@ftshijt (Contributor, Author) replied:

Thanks for the review!

This argument has a different meaning: it refers to using the same discrete token sequence for both the low-resolution and high-resolution streams, where the low-resolution labels are generated by skip-downsampling. We also support using different tokens for each resolution (see Appendix B.7).

For multi-task vs. single-task, we have a separate "use_single_prediction" argument; its default is false, so the focus stays on multi-task training.
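
As a minimal illustration of the skip-downsampling mentioned above (not fairseq code; the function name and factor are made up), the low-resolution stream just keeps every k-th frame label of the shared high-resolution sequence:

import numpy as np

def skip_downsample(labels: np.ndarray, factor: int) -> np.ndarray:
    # Keep every `factor`-th frame label, e.g. 20 ms-rate labels -> 40 ms-rate labels.
    return labels[::factor]

high_res_labels = np.array([3, 3, 7, 7, 7, 1, 1, 4])  # toy cluster ids at the high resolution
low_res_labels = skip_downsample(high_res_labels, factor=2)  # -> array([3, 7, 7, 1])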

@ftshijt (Contributor, Author) added:

But thanks for pointing out the confusion; I will add a few notes to the README later to make this clear to future users.

The reviewer replied:

Thank you, it is clearer to me now. I was just confused by the term; I will read the code thoroughly.

Resolved review threads:
  • fairseq/tasks/multires_hubert_pretraining.py (1 thread)
  • fairseq/models/multires_hubert/multires_hubert.py (4 threads, outdated)
@ftshijt (Contributor, Author) commented on Jan 23, 2024

Many thanks for the code review! @annasun28
The PR is ready to merge on my side. Could you please help merge the PR?

@annasun28 merged commit 34973a9 into facebookresearch:main on Feb 26, 2024
1 of 5 checks passed