ESPnet Version 202204

@kan-bayashi kan-bayashi released this 12 Apr 04:36
48c3510

News

Starting with this release, ESPnet uses date-based versioning, e.g., v.202204.
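
For scripts that need to compare release tags, a date-based tag can be mapped to a calendar month. The following is a minimal sketch, assuming the tag is a four-digit year followed by a two-digit month, optionally prefixed with `v` or `v.` as in the example above. The helper name `parse_date_version` is hypothetical and is not part of ESPnet.

```python
import re
from datetime import date

# Assumed tag layout: optional "v" or "v." prefix, then YYYYMM (e.g. "v.202204").
_TAG_PATTERN = re.compile(r"^v?\.?(\d{4})(\d{2})$")


def parse_date_version(tag: str) -> date:
    """Map a date-based tag to the first day of its month (hypothetical helper)."""
    match = _TAG_PATTERN.match(tag.strip())
    if match is None:
        raise ValueError(f"not a date-based version tag: {tag!r}")
    year, month = int(match.group(1)), int(match.group(2))
    # date() itself raises ValueError if the month is outside 1..12.
    return date(year, month, 1)


if __name__ == "__main__":
    # Tags ordered by release month; the later tag compares greater.
    print(parse_date_version("v.202204") < parse_date_version("v.202207"))
```

Comparing the resulting `date` objects orders tags chronologically without relying on string comparison.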

New Features

  • [New Features][ESPnet1] added learnable fourier features #4029 by @popcornell
  • [New Features][ESPnet1][ESPnet2][ASR] Restricted Self Attention for E2E Speech Summarization #4071 by @roshansh-cmu
  • [New Features][ESPnet1][Installation][README] add lrs avsr recipe #4104 by @wentaoxandry
  • [New Features][ESPnet1][README] add lip reading sentences dataset code #4074 by @wentaoxandry
  • [New Features][ESPnet2][ASR] [ESPnet2] Intermediate/Self-conditioned CTC #4084 by @YosukeHiguchi
  • [New Features][ESPnet2][ASR] [WIP] [ESPnet2] Mask-CTC #4158 by @YosukeHiguchi
  • [New Features][ESPnet2][ASR][README] Add stochastic depth to conformer and share results on LibriSpeech 960h #4142 by @pyf98
  • [New Features][ESPnet2][MT] MT task for espnet2 with IWSLT14 recipe #4111 by @siddalmia
  • [New Features][ESPnet2][README][SE] Add DC-CRN complex masking and spectral mapping approach for speech enhancement #4127 by @Emrys365
  • [New Features][ESPnet2][README][SE] Add DCCRN separator #4097 by @Johnson-Lsx
  • [New Features][ESPnet2][README][SE] Add a new separator for speech enhancement/separation tasks #4062 by @LiChenda
  • [New Features][ESPnet2][README][SE] Add iFaSNet for enhancement/separation tasks. #4130 by @LiChenda
  • [New Features][ESPnet2][SE] Refactor DNN_Beamformer in espnet2 and add new beamformers #4082 by @Emrys365

Enhancement

  • [Enhancement][ESPnet2] Add an optional suffix to the averaged model file name #4067 by @pyf98
  • [Enhancement][ESPnet2] Update perturb_data_dir_speed.sh #4091 by @AmirHussein96
  • [Enhancement][ESPnet2][ASR] Add tests for Intermediate/Self-conditioned CTC #4117 by @YosukeHiguchi
  • [Enhancement][ESPnet2][TTS] Add option to use norm. feats over denorm. #4250 by @G-Thor

Recipe

  • [Recipe][ESPnet1][RNNT] [ESPNET1] Add the results of conformer-transducer for Librispeech #4080 by @eesungkim
  • [Recipe][ESPnet2][ASR] Add ASR recipe for VCTK dataset based on TTS's dataprep. #4088 by @kashikashi
  • [Recipe][ESPnet2][ASR] Add new conformer config with hop length 160 for LibriSpeech 960h #4162 by @pyf98
  • [Recipe][ESPnet2][ASR] Add new zh_openslr38 ASR recipe #4181 by @cuichenx
  • [Recipe][ESPnet2][ASR] Add transformer results for LibriSpeech 100h #4089 by @pyf98
  • [Recipe][ESPnet2][ASR] Added Marathi OpenSLR 64 recipe #4179 by @SujaySKumar
  • [Recipe][ESPnet2][ASR] Added recipe for Microsoft Speech Corpus (Indian languages) #4194 by @chintu619
  • [Recipe][ESPnet2][ASR] Automatic lyric recognition Recipe #4129 by @ftshijt
  • [Recipe][ESPnet2][ASR] ESPnet - LRS3 Recipe #4101 by @gdebayan
  • [Recipe][ESPnet2][ASR] bengali asr model with no finetuning #4047 by @dzeinali
  • [Recipe][ESPnet2][MT] IWSLT'14 Results using ESPnet2-MT #4132 by @pyf98
  • [Recipe][ESPnet2][README] Mandarin ISO id should be CMN instead of ZHO #4125 by @xinjli
  • [Recipe][ESPnet2][README] Update README.md #4037 by @dzeinali
  • [Recipe][ESPnet2][README] Update README.md #4121 by @dzeinali
  • [Recipe][ESPnet2][README] Update README.md for How2 2000h ASR,SUM #4155 by @roshansh-cmu
  • [Recipe][ESPnet2][RNNT] Create decode_rnnt_conformer.yaml #4058 by @sw005320
  • [Recipe][ESPnet2][RNNT] Create train_rnnt_conformer.yaml #4057 by @sw005320
  • [Recipe][ESPnet2][SLU] Add IEMOCAP results and configs #4100 by @YushiUeda
  • [Recipe][ESPnet2][SLU] Add new config and support for computing WER in SLUE-VoxCeleb #4152 by @siddhu001
  • [Recipe][ESPnet2][SLU] Add sentiment data preparation for IEMOCAP #4065 by @YushiUeda
  • [Recipe][ESPnet2][SLU] ESPnet2 swbd_sentiment recipe #4134 by @YushiUeda
  • [Recipe][ESPnet2][ST] egs2/iwslt22_dialect #4013 by @brianyan918

Bugfix

  • [Bugfix][CI][ESPnet2] Fix CI test failures related to torch_complex 0.4.0 #4112 by @Emrys365
  • [Bugfix][CI][Installation] fix doc ci by pinning jinja version #4239 by @xinjli
  • [Bugfix][ESPnet2] Fix n-gram decoding #4168 by @sw005320
  • [Bugfix][ESPnet2] bug fixes and efficient train/dev split in data prep of Microsoft Indian Languages recipe #4196 by @chintu619
  • [Bugfix][ESPnet2] fix errors in configs of librispeech ssl frontends #4098 by @simpleoier
  • [Bugfix][ESPnet2][ASR][ST] [bug patch] egs2/iwslt22_dialect #4049 by @brianyan918
  • [Bugfix][ESPnet2][MT][ST] Fix joint tokenization in st.sh #4143 by @pyf98
  • [Bugfix][ESPnet2][MT][ST] scoring fixes MT and ST #4146 by @siddalmia
  • [Bugfix][ESPnet2][TTS] Fix speaker normalization #4229 by @LanceaKing
  • [Bugfix][Installation] set gtn version #4122 by @brianyan918
  • [Bugfix][ESPnet1][ESPnet2] minor fixes in ST in espnet2 #4056 by @siddalmia

Others

Acknowledgements

Special thanks to @AmirHussein96, @Emrys365, @Fhrozen, @G-Thor, @JDongian, @Johnson-Lsx, @LanceaKing, @LiChenda, @ShigekiKarita, @SujaySKumar, @YosukeHiguchi, @YushiUeda, @brianyan918, @chintu619, @cuichenx, @dzeinali, @eesungkim, @ftshijt, @gdebayan, @kan-bayashi, @karthik19967829, @kashikashi, @ooyamatakehisa, @popcornell, @pyf98, @roshansh-cmu, @siddalmia, @siddhu001, @simpleoier, @sw005320, @wentaoxandry, @xinjli.