Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add SLUE_PERB codebase #5685

Open
wants to merge 33 commits into
base: master
Choose a base branch
from
Open

Add SLUE_PERB codebase #5685

wants to merge 33 commits into from

Conversation

siddhu001
Copy link
Collaborator

What?

Add required codebase for training SLUE-PERB models

@mergify mergify bot added the ESPnet2 label Feb 28, 2024
@sw005320 sw005320 added SLU Spoken language understanding SSL self-supervised learning Recipe labels Feb 28, 2024
@sw005320 sw005320 added this to the v.202405 milestone Feb 28, 2024
@@ -35,7 +35,7 @@ fi

if [ ${stage} -le 1 ] && [ ${stop_stage} -ge 1 ]; then
if [ ! -e "${VOXPOPULI}/LICENSE.txt" ]; then
echo "stage 1: Download data to ${VOXPOPULI}"
echo "stage 1: Please download data to ${VOXPOPULI}"

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It will be good to also print a link to dataset download instructions.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added in the README

@siddhu001 siddhu001 marked this pull request as ready for review March 8, 2024 21:22
@mergify mergify bot added the README label Mar 9, 2024
Copy link

codecov bot commented Mar 11, 2024

Codecov Report

Attention: Patch coverage is 27.48538% with 248 lines in your changes are missing coverage. Please review.

Project coverage is 75.71%. Comparing base (5180c85) to head (0e9953b).
Report is 1 commits behind head on master.

❗ Current head 0e9953b differs from pull request most recent head 0e7be99. Consider uploading reports for the commit 0e7be99 to get more accurate results

Files Patch % Lines
...net2/slu/postencoder/conformer_full_postencoder.py 17.79% 97 Missing ⚠️
espnet2/slu/espnet_model.py 27.69% 47 Missing ⚠️
espnet2/bin/slu_inference.py 13.04% 40 Missing ⚠️
espnet2/slu/postencoder/Featurizer.py 26.31% 28 Missing ⚠️
espnet2/asr/encoder/whisper_encoder.py 48.27% 15 Missing ⚠️
espnet2/asr/espnet_model.py 55.55% 8 Missing ⚠️
espnet2/asr/encoder/e_branchformer_encoder.py 12.50% 7 Missing ⚠️
espnet2/tasks/asr.py 70.00% 3 Missing ⚠️
espnet2/tasks/slu.py 70.00% 3 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##           master    #5685       +/-   ##
===========================================
+ Coverage   32.51%   75.71%   +43.20%     
===========================================
  Files         768      749       -19     
  Lines       70372    69738      -634     
===========================================
+ Hits        22878    52802    +29924     
+ Misses      47494    16936    -30558     
Flag Coverage Δ
test_configuration_espnet2 ∅ <ø> (?)
test_integration_espnet1 62.92% <ø> (?)
test_integration_espnet2 48.79% <22.62%> (-0.02%) ⬇️
test_python_espnet1 18.23% <0.00%> (?)
test_python_espnet2 52.57% <27.48%> (?)
test_python_espnetez ?
test_utils 20.91% <ø> (?)

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@sw005320
Copy link
Contributor

@pyf98, can you also review this PR?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is the filename capitalized?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi I want to keep the same name as class name of s3prl function I copied: https://github.com/s3prl/s3prl/blob/66b54fd85980b9a822a7e5ce6f342b7d0edfcc23/s3prl/nn/upstream.py#L234. But I agree, I think it is better to keep it standard and not capitalized.

@pyf98
Copy link
Collaborator

pyf98 commented Mar 12, 2024

I have a high-level discussion about the desgin:
Should we add more components into the asr task? I'm recently feeling that the ASR task is becoming more complicated, but some features are not really for ASR.
If we just want to use a pre-trained ASR model, another way is to load the checkpoint using the ASR task and manipulate the encoded features (e.g., weighted sum, or additional modules) in another downstream module like SLU.

@siddhu001
Copy link
Collaborator Author

Hi @pyf98,

Thanks a lot for the suggestion!

Yes, I had to modify the ASR files because, for summarization, we need to pre-train the model first on the ASR task and then fine-tune it on the summarization task. Although I made only a few changes to the ASR file, I agree with you that, in general, the ASR files are becoming complicated. In future, I will ensure to only manipulate the encoded features while fine-tuning the downstream task. However, for this PR, to ensure reproducibility with the results I obtained, I believe adding these changes to the ASR file is may be necessary.

Copy link
Contributor

@sw005320 sw005320 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see that you added some extensions related to the postdconcer and some changes in e-branchformer and whisper encoders.
Can you add a test for these parts?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add some information about each corpus to https://github.com/espnet/espnet/blob/master/egs2/README.md?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found that some configs have the corresponding models while others are not.
If you intentionally did it, that is fine.
If not, maybe we can upload a model (at least major ones, e.g., whisper and wavlm)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto.
Please deal with model upload appropriately.

@@ -161,11 +162,22 @@
type_check=AbsEncoder,
default="rnn",
)
prepostencoder_choices = ClassChoices(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be better to explain what is prepostencoder in the source.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it go through the test?
If not, please add it.

@sw005320
Copy link
Contributor

Hi @siddhu001, can you reflect on my comments?

@siddhu001
Copy link
Collaborator Author

siddhu001 commented Apr 24, 2024

Yes I will reflect on your comments and update the PR soon.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ESPnet2 README Recipe SLU Spoken language understanding SSL self-supervised learning
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants