Add Multiresolution HuBERT as an additional upstream model #517
Conversation
Minor fix to s3prl-vc for new librosa versions
@leo19941227 Hi Leo, the PR is ready for review now (I also fixed the Hugging Face repo for a range of pre-trained models). I will continue to add corresponding docs to the documentation page.
Hi @ftshijt ! Sure, I will review the changes this weekend. Thanks so much!
multires_hubert_base
~~~~~~~~~~~~~~~~~~~~

- Unlabeled Speech: LibriSpeech 960hr
- K-means extracted from `hubert_base`_

multires_hubert_large
~~~~~~~~~~~~~~~~~~~~~

- Unlabeled Speech: LibriLight 60khr
- K-means extracted from `hubert_base`_

multires_hubert_multilingual_base
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

- Unlabeled Speech: Voxpopuli 100khr
- K-means extracted from `hubert_base`_

multires_hubert_multilingual_large400k
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

- Unlabeled Speech: Voxpopuli 100khr
- K-means extracted from `hubert_base`_
- Training steps: 400k

multires_hubert_multilingual_large600k
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

- Unlabeled Speech: Voxpopuli 100khr
- K-means extracted from `hubert_base`_
- Training steps: 600k
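As a quick orientation for readers of this doc diff: once these entries are registered, each model should be loadable through S3PRL's standard hub interface. The sketch below is illustrative only, not part of this PR; it assumes the entry name matches the heading above and that the upstream follows the usual S3PRL output convention.

```python
# Illustrative sketch (not part of this PR): loading one of the new
# upstreams via S3PRL's hub interface, assuming the entry name matches
# the documentation heading above.
import torch
import s3prl.hub as hub

model = getattr(hub, "multires_hubert_base")()  # downloads the checkpoint on first call
model.eval()

wavs = [torch.randn(16000)]  # a list of 16 kHz waveforms, one tensor per utterance
with torch.no_grad():
    outputs = model(wavs)

# S3PRL upstreams return a dict; "hidden_states" holds the per-layer features.
print(len(outputs["hidden_states"]), outputs["hidden_states"][0].shape)
```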
For the K-means extraction source: the current document implies that all the models use the K-means from the 2nd iteration of HuBERT Base. Could you help double-check whether this is true? If so, it means `multires_hubert_base` is effectively the 3rd iteration of HuBERT training, which might not be directly comparable to `hubert_base`.
Yeah, the current note is true. All the provided models should be considered the 3rd iteration of HuBERT training (not directly comparable to `hubert_base`). In our paper, we also conducted experiments with a "comparable" HuBERT trained with the same 3rd-iteration k-means.
In that case, do you suggest I submit those models (3rd-iteration HuBERT) to s3prl as well?
Oh, I think adding the 3rd iteration of HuBERT is not necessary. I was just curious and making sure that the note is correct. I have no further issues then. (You can still submit the 3rd-iteration HuBERT if you want.)
Thanks @ftshijt !
I have a few comments, please help check them.
Thanks!
Many thanks for the review! I've removed the unused legacy note. I'm not sure whether we also want to add the 3rd-iteration HuBERT base as discussed above, but even in that case I can probably just add it to the s3prl/hubert Hugging Face repo and leave it there for people interested in a fair comparison between HuBERT and MR-HuBERT (let me know if you have any better ideas~).
Hi @ftshijt , I am good with the current changes and I am ready to merge this.
Thanks again for the review. I would prefer to merge this PR as-is; for the 3rd-iteration HuBERT, I will commit it to the Hugging Face repo directly for reference purposes.
Sounds good!
As per the discussion in #515, this PR adds Multiresolution HuBERT as an additional upstream model.
TODOs:

- Add the new upstream entries to `hubconf.py`

Reference PR in fairseq:
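For context on the `hubconf.py` item: S3PRL upstreams are typically exposed as hub entry functions that resolve a checkpoint and wrap it in an upstream expert. The sketch below is purely hypothetical; `UpstreamExpert` and `_urls_to_filepaths` stand in for whatever this PR's actual implementation uses, and the checkpoint URL is elided.

```python
# Hypothetical sketch of a hub entry for the new upstream; the names and
# URL below are placeholders, not the actual code added in this PR.
def multires_hubert_base(refresh=False, **kwargs):
    """Base model: LibriSpeech 960hr, k-means from hubert_base."""
    ckpt_url = "https://huggingface.co/..."  # actual checkpoint URL elided
    ckpt = _urls_to_filepaths(ckpt_url, refresh=refresh)  # download/cache helper (placeholder)
    return UpstreamExpert(ckpt, **kwargs)
```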
Btw, during the evaluation of the VC task, I also fixed a minor bug related to recent updates in librosa.
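For reference, the librosa breakage mentioned here is most likely the change in librosa 0.10, which made the sample-rate arguments of functions such as `librosa.resample` keyword-only. The exact call site in s3prl-vc is not shown here, but a typical before/after looks like:

```python
import numpy as np
import librosa

y = np.random.randn(16000).astype(np.float32)

# Before (librosa < 0.10): positional sample rates were accepted.
# y_24k = librosa.resample(y, 16000, 24000)  # raises TypeError on librosa >= 0.10

# After: orig_sr and target_sr must be passed as keywords.
y_24k = librosa.resample(y, orig_sr=16000, target_sr=24000)
```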