
Multires hubert #5363

Merged: 10 commits merged into facebookresearch:main on Feb 26, 2024

Conversation

@ftshijt (Contributor) commented on Oct 31, 2023

Before submitting

  • Was this discussed/approved via a GitHub issue? (not needed for typos or doc improvements)
  • Did you read the contributor guideline?
  • Did you make sure to update the docs?
  • Did you write any new necessary tests?

What does this PR do?

Adds a new multi-resolution HuBERT implementation to fairseq, described in https://arxiv.org/abs/2310.02720.
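
As a rough sketch of downstream use, assuming fairseq's generic checkpoint utilities, a placeholder checkpoint path (pre-trained models are not uploaded yet; see the TODOs below), and that the model keeps a HuBERT-style extract_features entry point:

import torch
from fairseq import checkpoint_utils

# Placeholder path; no official MR-HuBERT checkpoint accompanies this PR yet.
ckpt_path = "/path/to/mr_hubert_checkpoint.pt"
models, saved_cfg, task = checkpoint_utils.load_model_ensemble_and_task([ckpt_path])
model = models[0].eval()

# 16 kHz mono waveform, batch of 1, shaped (batch, samples) as fairseq speech models expect.
wav = torch.randn(1, 16000)
with torch.no_grad():
    # Assumption: the multi-resolution model exposes the usual HuBERT-style API.
    features, _ = model.extract_features(source=wav, padding_mask=None)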

PR review

Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in GitHub issues, there's a high chance it will not be merged.

TODOs (in progress)

  • Add training configs/preprocessing scripts
  • Add documentation
  • Upload pre-trained models

untie_final_proj: true
activation_dropout: 0.0
conv_adapator_kernal: 1
use_single_target: true

Hi @ftshijt, first of all, thank you for this interesting research and the pre-training scripts. If I understand correctly, the paper (Appendix B.4) shows that predicting a single target is slightly worse than multi-task prediction, so why is use_single_target set to true in all the configs?

@ftshijt (Contributor, Author) replied:

Thanks for the review!

This argument has a different meaning: it refers to using the same discrete token sequence for both the low-resolution and high-resolution streams, where the low-resolution labels are generated by skip-downsampling. We also support using different tokens for each resolution (see Appendix B.7).

For multi-task vs. single-task, we have a separate "use_single_prediction" argument; its default is false, so the focus stays on multi-task training.
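
As a minimal illustration of the skip-downsampling mentioned above (not fairseq code; the function name and factor are made up), the low-resolution stream just keeps every k-th frame label of the shared high-resolution sequence:

import numpy as np

def skip_downsample(labels: np.ndarray, factor: int) -> np.ndarray:
    # Keep every `factor`-th frame label, e.g. 20 ms-rate labels -> 40 ms-rate labels.
    return labels[::factor]

high_res_labels = np.array([3, 3, 7, 7, 7, 1, 1, 4])  # toy cluster ids at the high resolution
low_res_labels = skip_downsample(high_res_labels, factor=2)  # -> array([3, 7, 7, 1])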

@ftshijt (Contributor, Author) added:

But thanks for pointing out the confusion; I will add a few notes to the README later to make this clear to future users.

The reviewer replied:

Thank you, it is clearer to me now. I was just confused by the term; I will read the code thoroughly.

Resolved review threads:
  • fairseq/tasks/multires_hubert_pretraining.py (1 thread)
  • fairseq/models/multires_hubert/multires_hubert.py (4 threads, outdated)
@ftshijt (Contributor, Author) commented on Jan 23, 2024

Many thanks for the code review! @annasun28
The PR is ready to merge on my side. Could you please help merge the PR?

@annasun28 merged commit 34973a9 into facebookresearch:main on Feb 26, 2024
1 of 5 checks passed