ESPnet version 202304
What's Changed
- Update collect stats stage to reduce memory cost in Utt_mvn by @simpleoier in #4888
- Apply the latest black by @kamo-naoyuki in #4907
- Add pytorch=1.13.1 to CI configuration by @kamo-naoyuki in #4906
- Fix incorrect URL in How2 README by @roshansh-cmu in #4902
- standardized inference and number of iterations for mSuperb single lang track by @DanBerrebbi in #4905
- Fix typo in lrs/README.md by @eltociear in #4911
- MSUPERB setting update by @ftshijt in #4913
- Update test_import.yaml to install numba by @kamo-naoyuki in #4918
- update pyopenjtalk version to 0.3.0 by @kan-bayashi in #4912
- CHiME-7 Task1 recipe by @popcornell in #4894
- Update CHiME-7 Task 1 README.md by @popcornell in #4920
- Use native CPU version of STFT on newer pytorch versions, fix librosa window size < fft by @bmilde in #4922
- Add few shot subset for mSuperb multilingual setting by @guapaQAQ in #4923
- Fix existing bugs in the TSE task by @Emrys365 in #4915
- IAM OCR recipe updates by @kenzheng99 in #4927
- Fixing some issues with chime7-task1 baseline by @popcornell in #4925
- set default none decoder for ASR by @ftshijt in #4917
- Update inference and training setting for mSuperb multilingual model by @guapaQAQ in #4932
- Add E-Branchformer Transducer results by @pyf98 in #4933
- add tf-gridnet by @zqwang7 in #4864
- Fixes + Channel Selection for CHiME-7 Task by @popcornell in #4934
- fix extracted feature dummy generation by @roshansh-cmu in #4926
- Fix device mismatch error in GPU decoding with PyTorch 1.13 by @pyf98 in #4941
- CHiME-7 DASR MD5 checksum fix for mixer6/train_call by @popcornell in #4942
- Update show_asr_result.sh by @kamo-naoyuki in #4943
- CHiME-7 DASR correct development results by @popcornell in #4946
- Fix 'floordiv is deprecated' warnings by @fujimotos in #4945
- Added WSL2 installation instructions by @sw005320 in #4949
- Update Muskits by @A-Quarter-Mile in #4931
- Set a longer execution time threshold for CI jobs that failed due to time-outs by @ftshijt in #4962
- Modify data prep for mSUPERB multilingual by @guapaQAQ in #4965
- Add E-Branchformer results in some recipes by @pyf98 in #4958
- Add 'six' as a required Python module by @fujimotos in #4964
- add msuperb linguistic analysis by @hhhaaahhhaa in #4938
- Fix a 'ref_channel'-related issue in espnet2/bin/enh_inference.py by @Emrys365 in #4972
- Add E-Branchformer results in slurp_entity by @pyf98 in #4971
- Add Conformer and E-Branchformer results in fisher_spanish_callhome ASR by @pyf98 in #4976
- [SVS] Add Joint-training by @A-Quarter-Mile in #4977
- Update the chunk iterator for the TSE task by @Emrys365 in #4929
- update msuperb LID scoring script by @hhhaaahhhaa in #4979
- add multilingual+lid lid score generation by @hhhaaahhhaa in #4982
- Add python=3.10 to CI by @kamo-naoyuki in #4627
- LID score v2 by @hhhaaahhhaa in #4983
- Fix ci by @kamo-naoyuki in #4985
- Change to use Ubuntu-latest instead of Ubuntu-18.04 in CI by @kamo-naoyuki in #4986
- Remove six by @kamo-naoyuki in #4988
- Modify format_wav_scp.py to support PCM of uint8, int32, float32, float64, etc. by @kamo-naoyuki in #4997
- Fix Whisper tokenizer CI error by @slSeanWU in #5004
- fix s3prl upstream attribute bug by @jwrh in #5003
- [Recipe] Add iwslt22 low resource speech translation task for egs2 by @freddy5566 in #4994
- Fix typeguard version by @silvanocerza in #5009
- Add .pre-commit-config.yaml by @kamo-naoyuki in #5011
- Copy Kaldi utils/steps/sid and add a new github action to check the consistency by @kamo-naoyuki in #4998
- Modify .pre-commit-config.yaml by @kamo-naoyuki in #5012
- Modify .pre-commit-config.yaml by @kamo-naoyuki in #5014
- Modify .pre-commit-config.yaml by @kamo-naoyuki in #5015
- [Tuning] iwslt22 low-resource ST decode configuration tuning by @freddy5566 in #5019
- Modify asr.sh by @kamo-naoyuki in #5020
- [SVS] Improve visinger by @jerryuhoo in #5022
- Use scripts/utils/print_args.sh instead of pyscripts/utils/print_args.py by @kamo-naoyuki in #5025
- Add docstring in extra_path.sh by @kamo-naoyuki in #5028
- Update installation.md by @kamo-naoyuki in #5029
- Update README.md by @kamo-naoyuki in #5030
- Update README.md by @kamo-naoyuki in #5031
- Change bc to python by @kamo-naoyuki in #5032
- Update tools/Makefile and path.sh by @kamo-naoyuki in #5027
- Fix for format_wav_scp.py by @kamo-naoyuki in #5038
- Add execute permission to install_ice_g2p.sh by @kamo-naoyuki in #5040
- Bug fix of #5025 by @kamo-naoyuki in #5039
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #5041
- Update README.md by @kamo-naoyuki in #5042
- Update README.md by @kamo-naoyuki in #5043
- Update README.md by @kamo-naoyuki in #5045
- Fix in gen_task1_data.sh from CHiME7 by @boeddeker in #4953
- Update README.md by @eml914 in #5044
- Add installers/install_ffmpeg.sh by @kamo-naoyuki in #5046
- Fix broken links reported by #5048 by @ShigekiKarita in #5050
- fix: resolve upgrade issues with praatio 6.0; lock praatio version by @timmahrt in #4978
- Add miniconda in gitignore by @pyf98 in #5052
- CHiME-7 DASR fixes from participants feedback by @popcornell in #4999
- Fix the condition for maxlen warning in beam search by @pyf98 in #5055
- Fixed SQLalchemy version for MFA by @Fhrozen in #5059
- Support Multi-Blank Transducer in ESPnet2 by @jctian98 in #4876
- Fix chime7 DASR task1 run.sh by @kamo-naoyuki in #5060
- CHiME-7 DASR recipe, fix display bug for scenario-wide DER and JER by @popcornell in #5061
- Add test_format_wav_scp_sh.bats by @kamo-naoyuki in #5062
- Update documentation by @kamo-naoyuki in #5063
- Support SOT training on LibriMix data. by @pengchengguo in #4861
- Update check_install.py by @kamo-naoyuki in #5066
- Tedlium3 recipe by @Some-random in #5068
- Bug fix: parameter key mismatch error when loading pretrained s3prl-frontend based models by @simpleoier in #5074
- Mechanism for multi-channel input using multi-column wav.scp by @kamo-naoyuki in #5075
- Clean ML-SUPERB by @ftshijt in #5067
- CHiME-7 DASR: first diarization system based on Pyannote. by @popcornell in #5054
- Chime7-task1 diarization (updated results) by @popcornell in #5088
- Add InterCTC to E-Branchformer encoder, and the ability to save InterCTC inference output to files by @tjysdsg in #5084
- [SVS] Bug fix: sample rate by @A-Quarter-Mile in #5094
- [SVS] Extend SingingGenerate by @A-Quarter-Mile in #5100
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #5080
- Add kaldi steps/libs by @kamo-naoyuki in #5106
- Fix sentencepiece version to v0.1.97 by @kamo-naoyuki in #5107
- Drop PyTorch<=1.9 by @kamo-naoyuki in #5111
- Update installers/install_kenlm.sh by @kamo-naoyuki in #5110
- Merge */{scripts,pyscripts} into asr1/{scripts,pyscripts} by @kamo-naoyuki in #5109
- Update ReazonSpeech training recipe for v1.1.0 by @fujimotos in #5114
- Fix typo in espnet2_format_wav_scp.md by @boeddeker in #5116
- Dtype for Speechbrain by @Fhrozen in #5112
- Add test of soundfile for Makefile by @kamo-naoyuki in #5119
- Add lm_inference for conditional text generation by @pyf98 in #5122
- CHiME-7 diarization (updated README.md) by @popcornell in #5102
- [WIP] Update Docker by @Fhrozen in #5128
- Fix several bugs and improve function design in SE by @Emrys365 in #5103
- [SVS] Update XiaoiceSing by @A-Quarter-Mile in #5124
- Add missing filter_scps scripts and note about kaldi for diarization example of mini_librispeech by @toto6038 in #5139
- Bump up the debian version to 11 by @kamo-naoyuki in #5144
- Bug fixing and improvement in SE functions by @Emrys365 in #5143
- Add data augmentation to ReazonSpeech recipe by @fujimotos in #5127
- Update error calculator for transducer by @aky15 in #5097
- Add streaming speech enhancement inference by @LiChenda in #5049
- Update README.md about debian by @sw005320 in #5146
- Fix issues in split scps by @pyf98 in #5138
- Fix #5148 by @kamo-naoyuki in #5149
- fix format_wav_scp.py by @kamo-naoyuki in #5150
- Add more stats to the training log by @Emrys365 in #5147
- update version to 202304 by @kan-bayashi in #5151
New Contributors
- @bmilde made their first contribution in #4922
- @guapaQAQ made their first contribution in #4923
- @zqwang7 made their first contribution in #4864
- @hhhaaahhhaa made their first contribution in #4938
- @jwrh made their first contribution in #5003
- @freddy5566 made their first contribution in #4994
- @silvanocerza made their first contribution in #5009
- @pre-commit-ci made their first contribution in #5041
- @boeddeker made their first contribution in #4953
- @timmahrt made their first contribution in #4978
- @Some-random made their first contribution in #5068
- @toto6038 made their first contribution in #5139
Full Changelog: v.202301...v.202304