SSL w/o torchaudio dependency #5537
base: master
Conversation
This pull request is now in conflict :(
@wanchichen, can you restart this PR?
This pull request is now in conflict :(
@wanchichen, let’s finish this PR. @simpleoier, please also review this PR.
@@ -512,11 +520,6 @@ def train_one_epoch(
):
    assert isinstance(batch, dict), type(batch)

-    if distributed:
I ran several experiments (ASR, SSL) and found that this check was not needed. But we may need to experiment with other batch samplers to make sure.
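For context on what the removed `if distributed:` branch guarded: in distributed training, every rank must reshuffle its batch sampler with the same epoch-dependent seed, so the per-rank shards stay disjoint while the order still changes between epochs. A minimal toy sketch (hypothetical `ShardedBatchSampler`, not ESPnet's actual sampler code) of that invariant:

```python
import random

class ShardedBatchSampler:
    """Toy sampler: shuffles per epoch, then shards batches across ranks.

    Hypothetical sketch, not ESPnet's implementation. The key point: every
    rank shuffles with the same epoch-dependent seed, so the global order is
    identical everywhere and the round-robin shards never overlap.
    """

    def __init__(self, n_items, batch_size, rank, world_size, seed=0):
        self.n_items = n_items
        self.batch_size = batch_size
        self.rank = rank
        self.world_size = world_size
        self.seed = seed
        self.epoch = 0

    def set_epoch(self, epoch):
        # Mirrors torch's DistributedSampler.set_epoch: call once per epoch.
        self.epoch = epoch

    def __iter__(self):
        idx = list(range(self.n_items))
        # Same seed on every rank -> identical global order -> disjoint shards.
        random.Random(self.seed + self.epoch).shuffle(idx)
        batches = [idx[i:i + self.batch_size]
                   for i in range(0, len(idx), self.batch_size)]
        # Round-robin shard: rank r takes batches r, r + world_size, ...
        return iter(batches[self.rank::self.world_size])
```

Whether an explicit `if distributed:` branch is needed in the trainer then depends on whether the sampler already handles epoch reseeding itself, which matches the experimental finding above.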
@@ -709,7 +731,7 @@ def train_one_epoch(
for iopt, optimizer in enumerate(optimizers):
    if optim_idx is not None and iopt != optim_idx:
        continue
-    optimizer.zero_grad()
+    optimizer.zero_grad(set_to_none=True)
This is the setting now recommended by PyTorch; it is both faster and more memory efficient. However, it may also slightly affect the learning curve.
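A toy illustration of the difference (plain Python, not PyTorch internals): zeroing writes zeros into the existing gradient buffer, while setting it to `None` drops the buffer entirely, so no memset happens and no memory is held until the next backward pass re-creates the gradient. The `ToyParam` class and `zero_grad` helper below are hypothetical stand-ins for illustration only:

```python
class ToyParam:
    """Minimal stand-in for a parameter with a .grad buffer."""
    def __init__(self):
        self.grad = None

    def accumulate(self, g):
        # Mimics autograd: "grad = g" if grad is None, else "grad += g".
        if self.grad is None:
            self.grad = list(g)          # fresh buffer allocated here
        else:
            self.grad = [a + b for a, b in zip(self.grad, g)]

def zero_grad(params, set_to_none=True):
    for p in params:
        if set_to_none:
            p.grad = None                # drop buffer: no write, no memory held
        elif p.grad is not None:
            p.grad = [0.0] * len(p.grad) # keep buffer, overwrite with zeros
```

The subtle learning-curve effect mentioned above comes from the same distinction: a zeroed gradient is still a real tensor, so stateful optimizers may treat "gradient of exactly zero" differently from "no gradient at all" (`None` is skipped).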
Thanks! I made some comments.
One question I have is about espnet2/ssl/mask. What is the benefit of a new mask module? If it is only used in HuBERT models during pre-training, a single function in the hubert model would be enough.
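For context, HuBERT-style masking picks span start positions at random and masks a fixed-length span from each. A minimal sketch of such a single function (hypothetical name and signature, not the proposed `espnet2/ssl/mask` module's actual API):

```python
import random

def compute_mask_indices(seq_len, mask_prob=0.08, mask_length=10, rng=None):
    """Return a boolean list marking HuBERT-style masked time steps.

    Hypothetical sketch: each position becomes a span start with
    probability `mask_prob`; a span of `mask_length` steps (clipped at
    the sequence end) is masked from every start. Spans may overlap.
    """
    rng = rng or random.Random()
    mask = [False] * seq_len
    for start in range(seq_len):
        if rng.random() < mask_prob:
            for t in range(start, min(start + mask_length, seq_len)):
                mask[t] = True
    return mask
```

The design question above is whether this deserves its own module: a standalone one is reusable across future SSL objectives, while a single function keeps the HuBERT model self-contained.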
egs2/librispeech/ssl1/conf/tuning/train_ssl_espnethubert_base_960h_pretrain_it1.yaml
for more information, see https://pre-commit.ci
What?
This PR enables HuBERT pre-training without torchaudio, allowing more customization and the use of different ESPnet components. It also introduces some tricks to better support large-scale training.
Features:
Supports only HuBERT for Transformer and E-Branchformer so far.
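For readers unfamiliar with the pre-training setup: HuBERT needs discrete targets for its masked frames, typically k-means cluster assignments of acoustic features, which a torchaudio-free pipeline must produce itself. A toy sketch of the assignment step only (hypothetical `assign_targets` helper, not the ESPnet pipeline):

```python
def assign_targets(features, centroids):
    """Toy HuBERT-style pseudo-labeling: map each frame's feature vector
    to the index of its nearest centroid (squared Euclidean distance).

    Hypothetical sketch; a real pipeline first fits k-means centroids on
    MFCCs (iteration 1) or hidden states (later iterations) over the
    whole corpus, then assigns every frame as below.
    """
    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return [min(range(len(centroids)), key=lambda k: sqdist(f, centroids[k]))
            for f in features]
```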
To do:
hubert.sh
which implementation to use