ESPnet version 202301
What's Changed
- Initialize VISinger branch by @ftshijt in #4683
- Update VISinger branch by @ftshijt in #4705
- Update UASR branch with latest ESPnet functions by @ftshijt in #4752
- Update uasr by @ftshijt in #4770
- Shell scripts for UASR processing by @ftshijt in #4769
- Uasr python scripts by @DongjiGao in #4791
- Update visinger by @ftshijt in #4818
- Update test_custom_transducer.py by @sw005320 in #4826
- Update asr.sh by @sw005320 in #4827
- Fixed pad mode for librosa.stft by @Masao-Someki in #4832
- Add E-Branchformer models in some recipes by @pyf98 in #4833
- Fix data prep in GigaSpeech by @pyf98 in #4836
- time sync decoding for asr by @brianyan918 in #4792
- Remove duplicated VOXFORGE in db.sh (line81 and line157) by @pyf98 in #4840
- Fix argument parsing for non_linguistic_symbols in asr.sh by @pyf98 in #4841
- Add a warning statement when the hypothesis length equals the max output length by @pengchengguo in #4843
- Add target speaker extraction (TSE) functions by @Emrys365 in #4823
- Multilingual superb by @ftshijt in #4824
- VISinger by @jerryuhoo in #4689
- Update VISinger to latest by @ftshijt in #4849
- VISinger for singing voice synthesis by @ftshijt in #4848
- Reduce word counts for ESPnet-SE++ Joss paper by @neillu23 in #4844
- Add E-Branchformer configs and models in ASR recipes by @pyf98 in #4837
- Address Muskits updates on README by @ftshijt in #4850
- Minor fix for MSUPERB recipe by @ftshijt in #4851
- Update for the latest changes in the draft (minor changes) by @neillu23 in #4852
- Add E-Branchformer results on Librispeech by @kkim-asapp in #4856
- Update HuBERT implementation by @simpleoier in #4747
- VISinger unit test by @jerryuhoo in #4855
- Minor fix to commonvoice espnet1 by @ftshijt in #4862
- [WIP] Add S4 decoder in ESPnet2 by @m-koichi in #4845
- Update HuBERT feature and acknowledgment information in related READMEs by @simpleoier in #4863
- Generating MFA alignments by @Fhrozen in #4803
- [WIP] EURO uasr scripts by @DongjiGao in #4846
- Update README.md related to ASR architecture by @m-koichi in #4865
- Minor fix to librimix diar recipe by @ftshijt in #4867
- Add Full Whisper Model for Finetuning by @slSeanWU in #4793
- Add torchaudio version check for HuBERT pretraining by @simpleoier in #4872
- add k2 decoder related scripts for EURO by @DongjiGao in #4868
- EURO: small fix (temporarily remove support for nbest_rescoring) by @DongjiGao in #4875
- Add description for Whisper ASR in homepage readme by @slSeanWU in #4877
- Update README.md by @eltociear in #4879
- add explanations to text tokenizing related scripts and remove unused script by @DongjiGao in #4880
- update information about source and our modification for k2 related scripts by @DongjiGao in #4881
- AphasiaBank ASR recipe by @tjysdsg in #4860
- Multilingual SUPERB update by @ftshijt in #4878
- ESPnet Unsupervised ASR (EURO project) by @ftshijt in #4774
- Support ProDiff in TTS by @Fhrozen in #4808
- Add E-Branchformer for GigaSpeech by @pyf98 in #4882
- FLEURS - Auxiliary CTC conditioning tasks by @wanchichen in #4756
- Add python 3.8 requirement for Whisper & update tests by @slSeanWU in #4891
- Update some ASR results in the main readme file by @pyf98 in #4883
- Add Conv2dSubsampling1 module and test it in AphasiaBank ASR recipe by @tjysdsg in #4892
- Support x-vector extractor based on RawNet by @Takaaki-Saeki in #4884
- single language track setups by @DanBerrebbi in #4895
- fixing bug deu1 by @DanBerrebbi in #4900
- Fix dataprep issues based on updated data release via Google form by @roshansh-cmu in #4899
- Add a new EGS2 recipe 'reazonspeech' by @fujimotos in #4885
- Update version to 202301 by @kan-bayashi in #4901
New Contributors
- @DongjiGao made their first contribution in #4791
- @jerryuhoo made their first contribution in #4689
- @m-koichi made their first contribution in #4845
- @fujimotos made their first contribution in #4885
Full Changelog: v.202211...v.202301