Releases: espnet/espnet
ESPnet Version 0.9.7
New Features
- [New Features][ESPnet1][ASR] Option for GTN CTC mode #2866 by @brianyan918
- [New Features][ESPnet2][SE][README] Update to speech enhancement task #2649 by @LiChenda
- [New Features][ESPnet2][ASR][README] Lightweight Sinc Convolutions for Espnet2 #2768 by @lumaku
- [New Features][ESPnet2][Documentation] --freeze_param option #2787 by @kamo-naoyuki
- [New Features][ESPnet2][TTS][README] Add a new G2P pyopenjtalk_accent_with_pause #2843 by @kan-bayashi
- [New Features][ESPnet2][TTS][README] Add pyopenjtalk_accent g2p for ESPnet2 TTS #2781 by @ota
- [New Features][ESPnet2][TTS][README] Support X-vector based multi-speaker TTS model in ESPnet2 #2800 by @kan-bayashi
Enhancement
- [Enhancement][ESPnet1][ESPnet2] Add version info in args #2841 by @kan-bayashi
- [Enhancement][ESPnet1][ESPnet2][ASR] AMI Recipe (Short UTT checker) #2802 by @ftshijt
- [Enhancement][Installation] add default activate_python.sh #2788 by @kamo-naoyuki
- [Enhancement][Installation] modified: check_install.py #2834 by @kamo-naoyuki
- [Enhancement][Installation][Documentation][ESPnet1][ESPnet2] Change version info location #2840 by @kan-bayashi
Bugfix
- [Bugfix][ESPnet1][ASR] fix greedy decoding #2812 by @b-flo
- [Bugfix][ESPnet2][ASR] Fix the compatibility of the pretrained ASR model #2794 by @kan-bayashi
- [Bugfix][Installation] Fix #2799 #2830 by @kamo-naoyuki
- [Bugfix][Installation] Fix HTS engine installation #2825 by @kan-bayashi
- [Bugfix][Installation] fix the incorrect $PATH setting in tools/extra_path.sh #2833 by @jumon
- [Bugfix][Recipe][ESPnet1][ASR] Minor fixes in CSJ #2837 by @YosukeHiguchi
- [Bugfix][Recipe][ESPnet1][ASR] fix recipe bug for librispeech #2735 by @yuekaizhang
- [Bugfix][Recipe][ESPnet2][ASR] fix a config name #2729 by @sw005320
- [Bugfix][Recipe][ESPnet2][ASR][README] Fix dirha_wsj recipe #2747 by @kamo-naoyuki
- [Bugfix][Recipe][ESPnet2][TTS] Add missing decoding configs in LibriTTS recipe #2827 by @kan-bayashi
Recipe
- [Recipe][ESPnet1][ASR] Add LibriSpeech Conformer results for LibriCSS #2861 by @akreal
- [Recipe][ESPnet1][ASR] Update Commonvoice Recipe with Conformer Settings #2739 by @ftshijt
- [Recipe][ESPnet1][ASR] Update Russian open STT recipe for v1.01 of the dataset #2776 by @akreal
- [Recipe][ESPnet1][ASR] Update models and results of Conformer. #2765 by @pengchengguo
- [Recipe][ESPnet1][ESPnet2][ASR][README] ESPnet2 recipe for commonvoice #2793 by @hchung12
- [Recipe][ESPnet1][VC][README] VCC2020 database #2754 by @unilight
- [Recipe][ESPnet2][ASR][README] Update Dirha WSJ result #2756 by @kamo-naoyuki
- [Recipe][ESPnet2][ASR][README] espnet2 hkust recipe #2863 by @kamo-naoyuki
- [Recipe][ESPnet2][ASR][README] update the AMI result in espnet2 #2817 by @sw005320
- [Recipe][ESPnet2][ASR][README] updated the laborotv result #2750 by @sw005320
- [Recipe][ESPnet2][ASR][README] Update reverb result #2876 by @kamo-naoyuki
- [Recipe][ESPnet2][ASR] Minor fix of laborotv recipe #2877 by @hfujihara
- [Recipe][ESPnet2][TTS] Fix total number of iterations #2813 by @kan-bayashi
- [Recipe][ESPnet2][TTS][README] Add libritts recipe for ESPnet2 #2807 by @kan-bayashi
- [Recipe][ESPnet2][TTS][README] Add x-vector based configs for VCTK #2808 by @kan-bayashi
- [Recipe][ESPnet2][TTS][README] Minor update TTS README #2818 by @kan-bayashi
- [Recipe][ESPnet2][TTS][README] Update JSUT TTS results #2792 by @kan-bayashi
- [Recipe][ESPnet2][TTS][README] Update JSUT results #2809 by @kan-bayashi
- [Recipe][ESPnet2][TTS][README] Update JSUT results #2871 by @kan-bayashi
- [Recipe][ESPnet2][TTS][README] Update LibriTTS results #2842 by @kan-bayashi
- [Recipe][ESPnet2][TTS][README] Update VCTK results #2814 by @kan-bayashi
- [Recipe][ESPnet2][TTS][README] Update libritts results #2828 by @kan-bayashi
- [Recipe][ESPnet2][TTS][README] update latest CSMSC link address #2777 by @meowtech
Other
- [CI][Documentation][Installation] Change warp-ctc and warp-transducer to extra #2748 by @kamo-naoyuki
- [CI][README] Update ci setting #2848 by @kan-bayashi
- [ASR][Documentation][ESPnet2] Sinc Convolutions - add documentation for plot_sinc_filters.py #2782 by @lumaku
- [Documentation][ESPnet1] fixed some typos #2855 by @jumon
- [Documentation][Installation] Update documentation #2757 by @kamo-naoyuki
- [Installation][Refactoring] Move the dependencies coming from recipes #2740 by @kamo-naoyuki
Acknowledgements
Special thanks to @AdolfVonKleist, @LiChenda, @YosukeHiguchi, @akreal, @b-flo, @brianyan918, @ftshijt, @hchung12, @hfujihara, @jumon, @kamo-naoyuki, @kan-bayashi, @lumaku, @meowtech, @ota, @pengchengguo, @sw005320, @unilight, @yuekaizhang.
ESPnet Version 0.9.6
New Features
- [New Features][ESPnet2] Wandb integration #2707 by @kamo-naoyuki
- [New Features][ESPnet2][ASR] Add ignore_nan_grad option for CTC #2699 by @kamo-naoyuki
- [New Features][ESPnet2][SE] Touching common modules before the main Enh PR #2705 by @LiChenda
Bugfix
- [Bugfix][ESPnet1] bug fix for pytorch1.7 #2656 by @kamo-naoyuki
- [Bugfix][ESPnet1][ESPnet2][TTS] Use nkf in CSMSC data prep #2726 by @kan-bayashi
- [Bugfix][ESPnet2] Fix flooring for global_mvn.py #2623 by @kamo-naoyuki
- [Bugfix][ESPnet2] Fix small bug of tensorboard part #2702 by @kamo-naoyuki
- [Bugfix][ESPnet2] Fix wandb mode with multi gpus #2709 by @kamo-naoyuki
- [Bugfix][ESPnet2][TTS] Fix token averaged feature in the case when r > 1 #2704 by @kan-bayashi
Recipe
- [Recipe][ESPnet1] Extend model averaging condition in run scripts #2613 by @b-flo
- [Recipe][ESPnet1][ASR] Enable multi-thread processing of json files. #2681 by @Peidong-Wang
- [Recipe][ESPnet1][ASR] Update KsponSpeech conformer results #2624 by @jubang0219
- [Recipe][ESPnet1][ASR] Update Voxforge with Conformer results #2642 by @YosukeHiguchi
- [Recipe][ESPnet1][ASR] lang was being used before being parsed for user input #2654 by @siddalmia
- [Recipe][ESPnet1][ASR][ESPnet2][Installation][README] espnet2 reverb recipe #2691 by @kamo-naoyuki
- [Recipe][ESPnet1][ASR][README] Update Switchboard with conformer results #2697 by @Emrys365
- [Recipe][ESPnet1][ASR][README] add librispeech conformer w/ speed perturbation + specaug #2617 by @yuekaizhang
- [Recipe][ESPnet2][ASR] ASR template recipe: --srctexts -> --lm_train_text, --bpe_train_text #2660 by @kamo-naoyuki
- [Recipe][ESPnet2][ASR] Add $token_type to asr_tag and lm_tag #2625 by @kamo-naoyuki
- [Recipe][ESPnet2][ASR][Installation][README][Recipe] Laborotv recipe #2703 by @sw005320
- [Recipe][ESPnet2][ASR][README] Add AISHELL w/o LM result #2718 by @kamo-naoyuki
- [Recipe][ESPnet2][ASR][README] ESPnet2 recipe for TIMIT #2568 by @sknadig
- [Recipe][ESPnet2][ASR][README] JSUT conformer recipe achieving 12.0/13.9 CER(%) for dev/eval1 #2720 by @hchung12
- [Recipe][ESPnet2][ASR][README] Update README.md #2659 by @sw005320
- [Recipe][ESPnet2][ASR][README] Update WSJ result #2628 by @kamo-naoyuki
- [Recipe][ESPnet2][ASR][README] espnet2 librispeech with conformer #2687 by @sw005320
- [Recipe][ESPnet2][README] Corpus README in egs2 #2713 by @sw005320
- [Recipe][ESPnet2][README] update egs2/README.md #2719 by @Emrys365
Enhancement
- [Enhancement][Documentation][ESPnet2] Add --init_param option #2680 by @kamo-naoyuki
- [Enhancement][ESPnet1][ASR] Save model snapshot at every epoch even if save_interval_iters > 0 - for model averaging #2637 by @sknadig
- [Enhancement][ESPnet2] Update wandb part #2708 by @kamo-naoyuki
- [Enhancement][ESPnet2][ASR] Add *_stats_dir options in asr.sh #2724 by @kan-bayashi
Documentation
- [Documentation][ESPnet2][README] Update egs2 README #2723 by @kan-bayashi
- [Documentation][ESPnet2][README][TTS] Update README about fine-tuning #2685 by @kan-bayashi
- [Documentation][ESPnet2][README][TTS] Update TTS README.md #2650 by @kan-bayashi
Refactoring
- [Refactoring][ESPnet1][ASR][README] Refactor Mask CTC non-autoregressive ASR #2223 by @YosukeHiguchi
- [Refactoring][ESPnet2] Added unicode support for generated configs #2672 by @Piteryo
Others
- [Installation] python setup.py install -> pip install -e #2619 by @kamo-naoyuki
- [Installation][Refactoring] modify for zsh: tools/extra_path.sh #2696 by @kamo-naoyuki
- [Docker] Docker flags for extra libraries (VC) #2622 by @Fhrozen
Acknowledgements
Special thanks to @Emrys365, @Fhrozen, @LiChenda, @Peidong-Wang, @Piteryo, @YosukeHiguchi, @b-flo, @hchung12, @jubang0219, @kamo-naoyuki, @kan-bayashi, @siddalmia, @sknadig, @sw005320, @yuekaizhang.
ESPnet Version 0.9.5
New Features
- [New Features][ESPnet2][TTS] Support g2p=none for text with phonemes #2551 by @kan-bayashi
- [New Features][ESPnet2][TTS] Add MCD evaluation script for ESPnet2-TTS #2554 by @kan-bayashi
- [New Features][ESPnet1][ST] Conformer End-to-End Speech Translation #2523 by @hirofumi0810
Bugfix
- [Bugfix][ESPnet1] CTC segmentation - package update #2566 by @lumaku
- [Bugfix][ASR][ESPnet1] fix bug about att_ws in multi-enc case #2549 by @lzm0706
- [Bugfix][ESPnet1] Conformer averaging model support for pytorch 1.6 #2604 by @siddalmia
- [Bugfix][ESPnet1][ASR] Set built-in CTC for asr_recog #2588 by @lumaku
- [Bugfix][ESPnet1][ASR][Installation] Transducer float16 loss bug fix #2496 by @GNroy
Recipe
- [Recipe][ESPnet1][ASR] Alignment recipe for CSJ. #2531 by @jnishi
- [Recipe][ESPnet1][ASR] New Recipe for KsponSpeech (Korean spontaneous speech; 969 hours) #2555 by @jubang0219
- [Recipe][ESPnet1][ASR] Update TedLium3 conformer results #2600 by @LiChenda
- [Recipe][ESPnet1][ASR] Update VIVOS models #2574 by @b-flo
- [Recipe][ESPnet1][ASR] Update model link in Puebla-Nahuatl #2607 by @ftshijt
- [Recipe][ESPnet1][ASR] Update tedlium2 with conformer results #2599 by @Emrys365
- [Recipe][ESPnet1][ASR] update the JSUT recipe with conformer #2546 by @sw005320
- [Recipe][ESPnet2][ASR] Add CSJ conformer config #2560 by @kan-bayashi
- [Recipe][ESPnet2][ASR] Add CSJ conformer results #2552 by @kan-bayashi
- [Recipe][ESPnet2][ASR] Small changes for aishell config #2586 by @kamo-naoyuki
- [Recipe][ESPnet2][ASR] Update espnet2 AISHELL results #2580 by @kamo-naoyuki
- [Recipe][ESPnet2][ASR] update JSUT espnet2 with pre-trained models #2563 by @sw005320
- [Recipe][ESPnet2][TTS] Add JSSS recipe for ESPnet2-TTS #2558 by @kan-bayashi
- [Recipe][ESPnet2][TTS] Update ESPnet2 TTS result #2542 by @kan-bayashi
CI
- [CI][Documentation] Support espnet2/bin in sphinx doc. #2544 by @ShigekiKarita
- [CI][Installation][README] Add pytorch1.7.0 ci test #2605 by @kamo-naoyuki
Other
- [Installation] Install warpctc-pytorch wheel when torch version is 1.1 - 1.6 #2547 by @ysk24ok
- [Installation] Modified requirements: "dataclasses; python_version < '3.7'", #2541 by @kamo-naoyuki
- [Installation] Remove pip3 check in setup_python.sh #2567 by @ShigekiKarita
Acknowledgements
Special thanks to @Emrys365, @GNroy, @LiChenda, @ShigekiKarita, @b-flo, @ftshijt, @hirofumi0810, @jnishi, @jubang0219, @kamo-naoyuki, @kan-bayashi, @lumaku, @lzm0706, @siddalmia, @sw005320, @ysk24ok.
ESPnet Version 0.9.4
New Features
- [New Features][ESPnet1][ASR] Transducer v4 #2444 by @b-flo
- [New Features][ESPnet2] Support audio_format=flac.ark, wav.ark #2451 by @kamo-naoyuki
- [New Features][ESPnet2][ASR] Support conformer encoder in ESPnet2 ASR #2515 by @kan-bayashi
Bugfix
- [Bugfix][ESPnet1] Fixed IndexError in BatchBeamSearch.post_process() (#2483) #2484 by @kan-bayashi
- [Bugfix][ESPnet1][LM] fix multigpu bug if pytorch>=1.5 #2492 by @kamo-naoyuki
- [Bugfix][ESPnet2] remove cleaner #2529 by @kamo-naoyuki
- [Bugfix][ESPnet2][TTS] Fix TTS inference bug for GST + Fastspeech2 #2498 by @kan-bayashi
Documentation
- [Documentation] Update espnet2_tutorial.md #2528 by @kamo-naoyuki
- [Documentation] Update espnet2_tutorial.md #2532 by @kamo-naoyuki
- [Documentation] Update espnet2_tutorial.md #2534 by @kamo-naoyuki
- [Documentation] Update notebook submodule #2499 by @kan-bayashi
- [Documentation][ESPnet1] Small fixes for transducer #2514 by @b-flo
- [Documentation][ESPnet2][README][TTS] Update ESPnet2 TTS README #2516 by @kan-bayashi
- [Documentation][README] Update README #2504 by @kan-bayashi
- [Documentation][README][ESPnet1] CTC segmentation - checks for blank chars and RNN models #2535 by @lumaku
Recipe
- [Recipe][ESPnet1][ASR] add conformer results for librispeech #2510 by @yuekaizhang
- [Recipe][ESPnet2][ASR] Update ESPnet2 CSJ Transformer results #2497 by @kan-bayashi
- [Recipe][ESPnet2][TTS] Add results for ESPnet2 TTS #2503 by @kan-bayashi
- [Recipe][ESPnet2][TTS] Update Transformer-TTS config #2494 by @kan-bayashi
- [Recipe][ESPnet2][TTS] Update Transformer-TTS configs #2502 by @kan-bayashi
Refactoring
- [Refactoring] Modify uttid to "${spkid}-${uttid}" for trn files #2527 by @kamo-naoyuki
- [Refactoring][ESPnet1][ASR][LM] Remove all future lines #2481 by @ShigekiKarita
- [Refactoring][ESPnet1][ASR][MT][ST] Unify arguments #2506 by @hirofumi0810
- [Refactoring][ESPnet1][ESPnet2][TTS] Refactor length regulator to improve the speed #2482 by @kan-bayashi
- [Refactoring][ESPnet1][MT][ST] Refactor decoding for translation tasks #2501 by @hirofumi0810
- [Refactoring][ESPnet2] Change add_scalars to add_scalar for tensorboard SummaryWriter #2525 by @kamo-naoyuki
CI
- [CI][ASR] Make test_e2e_asr.py faster #2488 by @ShigekiKarita
- [CI][ASR] Make test_e2e_asr_maskctc.py faster. #2493 by @ShigekiKarita
- [CI][ASR] Make test_recog.py faster #2486 by @ShigekiKarita
- [CI][ESPnet1][ASR] make test_e2e_asr_mulenc.py faster #2480 by @ruizhilijhu
- [CI][ESPnet1][Installation] Update shellcheck url. #2500 by @ShigekiKarita
- [CI][ESPnet2][Installation] Limit test execution time to 2.0 sec #2520 by @ShigekiKarita
- [CI][SE] Make test_beamformer_net.py faster #2489 by @ShigekiKarita
- [CI][SE] shorten test time for tasnet #2491 by @LiChenda
Other
- [Installation] Update h5py version to avoid errors in Python3.8 #2519 by @shigabeev
- [Docker] Docker Updates #2509 by @Fhrozen
Acknowledgements
Special thanks to @Fhrozen, @LiChenda, @ShigekiKarita, @b-flo, @hirofumi0810, @kamo-naoyuki, @kan-bayashi, @lumaku, @ruizhilijhu, @shigabeev, @yuekaizhang.
ESPnet Version 0.9.3
New Features
- [New Features][ESPnet2] Implement --grad_clip_type #2399 by @kamo-naoyuki
- [New Features][ESPnet2][ASR] Implement batch_score() method for ASR decoder and LM #2377 by @kamo-naoyuki
- [New Features][ESPnet2][README][TTS] Support Conformer-based FastSpeech / FastSpeech2 #2413 by @kan-bayashi
Bugfix
- [Bugfix][CI][ESPnet1][ESPnet2] make sure chainer independent #2411 by @kamo-naoyuki
- [Bugfix][CI][ESPnet1][Installation] Revert ctc seg installation #2392 by @kan-bayashi
- [Bugfix][CI][Installation] Fix the installation error in CI #2476 by @kan-bayashi
- [Bugfix][ESPnet1][ASR] Lazy import chainer in asr_utils.py #2407 by @kamo-naoyuki
- [Bugfix][ESPnet1][ASR] asr: Fix recog issue on Transformer CTC model #2394 by @jaesong
- [Bugfix][ESPnet1][MT][ST] Fix score_bleu.sh #2400 by @hirofumi0810
- [Bugfix][ESPnet1][README][Typo] fixed typo in egs/README.md #2473 by @mrazizi
- [Bugfix][ESPnet1][TTS] lazy import chainer: espnet/nets/tts_interface.py #2409 by @kamo-naoyuki
- [Bugfix][ESPnet2] Add missing database in db.sh #2427 by @kan-bayashi
- [Bugfix][ESPnet2] Fix the CommonPreprocessor_multi missing issue #2460 by @LiChenda
- [Bugfix][ESPnet2] Minor fix of egs2/commonvoice/asr1/local/data.sh #2438 by @kamo-naoyuki
- [Bugfix][ESPnet2] fix the directory for init_file_prefix #2412 by @kamo-naoyuki
- [Bugfix][ESPnet2] fix typo of log_level choices #2472 by @glynpu
- [Bugfix][ESPnet2][ASR] Add grep -H option #2388 by @kamo-naoyuki
- [Bugfix][ESPnet2][TTS] Fix wrong sum axis in energy extraction #2469 by @kan-bayashi
- [Bugfix][ESPnet2][Typo] Fix typo in help comment and docstrings #2470 by @kan-bayashi
- [Bugfix][Installation] add warpctc_pytorch version==0.1.2 #2403 by @kamo-naoyuki
Documentation
- [Documentation] Add bug report template #2396 by @sw005320
- [Documentation] Add installation issue template #2397 by @sw005320
- [Documentation] Update espnet2_distributed.md #2418 by @kamo-naoyuki
- [Documentation] Update espnet2_distributed.md #2419 by @kamo-naoyuki
- [Documentation] Update espnet2_training_option.md #2421 by @kamo-naoyuki
- [Documentation] Update faq.md #2431 by @kamo-naoyuki
- [Documentation] Update parallelization.md #2428 by @kamo-naoyuki
- [Documentation][ESPnet2][README] Update README.md #2430 by @kamo-naoyuki
Enhancement
- [Enhancement][ESPnet1][ESPnet2] Add -c option for multi GPUs mode for slurm.conf #2406 by @kamo-naoyuki
- [Enhancement][ESPnet1][Installation] Install warpctc-pytorch wheel when torch version is 1.1, 1.2 or 1.3 #2453 by @ysk24ok
- [Enhancement][ESPnet1][README] ADD CSJ RNN pretrained model #2452 by @jnishi
- [Enhancement][ESPnet2] Update db.sh #2426 by @kamo-naoyuki
- [Enhancement][ESPnet2][TTS] Update ESPnet2 TTS config #2468 by @kan-bayashi
- [Enhancement][ESPnet2][TTS] Update and add fastspeech2 configs #2429 by @kan-bayashi
- [Enhancement][Installation] Add sanity check for setup_cuda_env.sh #2389 by @kamo-naoyuki
- [Enhancement][Installation] Change cudatoolkit to cuda if cuda_version=8.0 #2405 by @kamo-naoyuki
- [Enhancement][Installation] Change to refer https://anaconda.org/pytorch/pytorch/files #2404 by @kamo-naoyuki
- [Enhancement][Installation] Workaround for soundfile issue #2437 by @kamo-naoyuki
Recipe
- [Recipe][ESPnet1][ASR] Add LibriCSS recipe #2246 by @akreal
- [Recipe][ESPnet1][ASR] Update for the Official Split of YM Recipe #2435 by @ftshijt
- [Recipe][ESPnet1][ESPnet2][ASR] Update CommonVoice for Latest Version #2455 by @ftshijt
- [Recipe][ESPnet2][ASR] [zeroth korean] Not to use pipe format if feats_type=raw #2402 by @kamo-naoyuki
- [Recipe][ESPnet2][ASR][README] espnet2 zeroth_korean recipe changing feats_type from fbank_pitch to raw. #2393 by @hchung12
- [Recipe][ESPnet2][README][TTS] Add ESPnet2 TTS finetuning example recipe (JVS) #2465 by @kan-bayashi
CI
- [CI] Add codecov actions. #2467 by @ShigekiKarita
- [CI] Fix hangup of unittests #2424 by @kamo-naoyuki
- [CI] Make espnet2 tts test faster #2461 by @kan-bayashi
- [CI] Make test_e2e_{asr,st,mt}_{transformer,conformer}.py faster. #2464 by @ShigekiKarita
- [CI] Update .gitignore #2434 by @kan-bayashi
- [CI][ESPnet1] Make test_(batch_)beam_search.py faster. #2462 by @ShigekiKarita
- [CI][ESPnet1] Support Debian9 and CentOS7 in Github Actions #2457 by @ShigekiKarita
- [CI][ESPnet1][Installation] Fix HKUST recipe #2440 by @kamo-naoyuki
Acknowledgements
Special thanks to @LiChenda, @ShigekiKarita, @akreal, @ftshijt, @glynpu, @hchung12, @hirofumi0810, @jaesong, @jnishi, @kamo-naoyuki, @kan-bayashi, @mrazizi, @sw005320, @ysk24ok.
ESPnet Version 0.9.2
New Features
- [New Features][ESPnet1] CTC segmentation #2301 by @lumaku
- [New Features][ESPnet2] Support multiple averaged nbest models #2353 by @kamo-naoyuki
- [New Features][ESPnet2] Support recursive add in pack_funcs and add images to packed model #2367 by @kamo-naoyuki
Bugfix
- [Bugfix][ASR][ESPnet1] remove ff_scale from conformer constructor arguments #2356 by @koji-okabe-hub
- [Bugfix][ASR][ESPnet2] use lm_exp instead of lm_tag for inference_tag #2352 by @kamo-naoyuki
- [Bugfix][CI][ESPnet1][Installation] Remove ctc_segmentation temporarily #2385 by @kan-bayashi
- [Bugfix][ESPnet1] Fix import error of conformer module #2384 by @kan-bayashi
- [Bugfix][ESPnet1] Fix issue #2211 #2219 by @Emrys365
- [Bugfix][ESPnet2] Add missing __init__.py #2326 by @kan-bayashi
- [Bugfix][ESPnet2] Fix --out_filename option: format_wav_scp.sh #2348 by @kamo-naoyuki
- [Bugfix][ESPnet2] Fix amp #2362 by @kamo-naoyuki
- [Bugfix][ESPnet2] add egs2/an4/asr1/local/path.sh #2343 by @kamo-naoyuki
- [Bugfix][ESPnet2] fix recursive add: espnet2/main_funcs/pack_funcs.py #2369 by @kamo-naoyuki
- [Bugfix][ESPnet2] remove unused import #2331 by @kamo-naoyuki
- [Bugfix][ESPnet2][Installation][Typo] fix typo #2344 by @kamo-naoyuki
- [Bugfix][ESPnet2][README] Fix typo #2372 by @Piteryo
- [Bugfix][ESPnet2][TTS] make vietnamese_cleaner an option #2341 by @kamo-naoyuki
- [Bugfix][Installation] Fix python version check for chainer #2342 by @kamo-naoyuki
- [Bugfix][Installation] add undefined variable: check_pytorch_cuda_compatibility.py #2361 by @kamo-naoyuki
- [Bugfix][TTS] Fix device allocation error in guided attention loss #2282 #2317 by @kan-bayashi
Documentation
- [Documentation] updated comment on the documentation #2351 by @GauravPandey892
- [Documentation][ESPnet2] Update TTS README #2316 by @kan-bayashi
- [Documentation][ESPnet2][README] Update ESPnet2 TTS README #2376 by @kan-bayashi
- [Documentation][ESPnet2][README][TTS] Update README #2330 by @kan-bayashi
- [Documentation][Installation] Divide setup_python.sh into setup_venv.sh and setup_python.sh #2382 by @kamo-naoyuki
- [Documentation][Installation] add a description about check install. #2360 by @sw005320
- [Documentation][README] CTC segmentation - Demo #2347 by @lumaku
- [Documentation][README] Update README.md #2379 by @kamo-naoyuki
Enhancement
- [Enhancement][ESPnet2] Change the default inference model to averaged model instead of the best #2346 by @kamo-naoyuki
- [Enhancement][ESPnet2][TTS] Add pitch and energy stats in packing #2350 by @kan-bayashi
- [Enhancement][Installation] Add checking for pytorch-cuda compatibility in Makefile #2334 by @kamo-naoyuki
- [Enhancement][Installation] Show raw error message when failed to import packages #2374 by @kamo-naoyuki
Refactoring
- [Refactoring] Apply new version black #2366 by @kamo-naoyuki
- [Refactoring][ASR][ESPnet2] Not to add _sp to $asr_exp if --asr_exp option is specified #2368 by @kamo-naoyuki
- [Refactoring][CI][ESPnet1][ESPnet2][Installation] Add installers for sctk and sph2pipe and create tools/extra_path.sh #2332 by @kamo-naoyuki
- [Refactoring][ESPnet1][Recipe] Disable preparation for lm in wsj recipe #2373 by @kamo-naoyuki
- [Refactoring][ESPnet2] Update Task design #2345 by @kamo-naoyuki
- [Refactoring][ESPnet2][SE] Remove unused option from enh.sh:--feats_normalize #2325 by @kamo-naoyuki
Recipe
- [Recipe][ASR][ESPnet1] MGB-2 #2289 by @AmirHussein96
- [Recipe][ASR][ESPnet1] Remove duplicated class definition of Conformer and update some new results of Aishell1 and Switchboard. #2364 by @pengchengguo
- [Recipe][ASR][ESPnet2][README] ASR WSJ RESULT update: Tuning LM #2355 by @kamo-naoyuki
- [Recipe][ASR][ESPnet2][README] add pretrained model link #2378 by @kamo-naoyuki
CI
- [CI][README] Update ubuntu images in circle ci #2349 by @ShigekiKarita
- [CI][mergify] Update .mergify.yml #2333 by @kamo-naoyuki
- [CI][mergify] Update .mergify.yml #2354 by @kamo-naoyuki
Acknowledgements
Special thanks to @AmirHussein96, @Emrys365, @GauravPandey892, @Piteryo, @ShigekiKarita, @kamo-naoyuki, @kan-bayashi, @koji-okabe-hub, @lumaku, @pengchengguo, @sw005320.
ESPnet Version 0.9.1
New Features
- [New Features] Add metric option to checkpoint averaging for Transformer #2259 by @hirofumi0810
- [New Features][ESPnet2] Generate run.sh in the experiment dir for resuming #2284 by @kamo-naoyuki
- [New Features][ESPnet2] Support larger num_iters_per_epoch than the number of batches in small corpus #2255 by @kamo-naoyuki
- [New Features][ESPnet2] Support torch native automatic mixed precision for espnet2 #2257 by @kamo-naoyuki
Documentation
- [Documentation] Update comments in MultiHeadAttention #2266 by @placebokkk
- [Documentation][ESPnet2] append comment in reporter.py #2267 by @kamo-naoyuki
- [Documentation][ESPnet2][README][TTS] Add ESPnet2 TTS recipe document #2312 by @kan-bayashi
Enhancement
- [Enhancement][ESPnet2] Tensorboard stats between iterations #2252 by @kamo-naoyuki
Refactoring
- [Refactoring][ESPnet2] Add some new features and a new recipe for the enhancement task #2238 by @Emrys365
- [Refactoring][Documentation] Remove installation part of Python from Makefile #2245 by @kamo-naoyuki
Recipe
- [Recipe][ASR] aidatatang conformer ESPnet1 recipe #2269 by @nzhoward
- [Recipe][ESPnet2] espnet2 zeroth_korean recipe #2279 by @hchung12
Bugfix
- [Bugfix] Fix #2295 #2311 by @kan-bayashi
- [Bugfix] Minor fix for Makefile #2268 by @kamo-naoyuki
- [Bugfix] Not to install cupy-cuda* for python>=3.8 #2277 by @kamo-naoyuki
- [Bugfix] Remove channel: setup_anaconda.sh #2303 by @kamo-naoyuki
- [Bugfix][ASR] ngram single decoding bug fix #2299 by @qmpzzpmq
- [Bugfix][ASR][ESPnet2] Add missing __init__.py #2292 by @kamo-naoyuki
- [Bugfix][ASR][ESPnet2] decode -> inference #2276 by @kamo-naoyuki
- [Bugfix][ASR][ESPnet2] remove chainer dependency from show_asr_result.sh #2281 by @kamo-naoyuki
- [Bugfix][ESPnet2] Avoid illegal summary name for tensorboard #2294 by @kamo-naoyuki
- [Bugfix][ESPnet2] Fix average_nbest_models for pytorch=1.6 #2283 by @kamo-naoyuki
- [Bugfix][ESPnet2] Fix decode config extension in ESPnet2 CSJ recipe #2258 by @kan-bayashi
- [Bugfix][ESPnet2] Fix for queue-freegpu.pl #2274 by @kamo-naoyuki
- [Bugfix][ESPnet2] Fix samplers about min_batch_size #2305 by @kamo-naoyuki
- [Bugfix][ESPnet2] Workaround for SGE jobname issue #2253 by @kamo-naoyuki
- [Bugfix][ESPnet2] add missing shebang #2306 by @kamo-naoyuki
- [Bugfix][ESPnet2] fix bug of reporter #2263 by @kamo-naoyuki
- [Bugfix][ESPnet2][Recipe] Update zeroth_korean #2308 by @kamo-naoyuki
- [Bugfix][ESPnet2][SE] add --spk-num 1 #2285 by @kamo-naoyuki
- [Bugfix][ESPnet2][distributed] Not to save config.yaml if rank!=0 #2287 by @kamo-naoyuki
Others
- [CI] Remove unnecessary installation when CI #2307 by @kamo-naoyuki
- [CI] Take integration tests into coverage #2254 by @ShigekiKarita
- [CI][ESPnet2] Add coverage measure for espnet2 integration test #2256 by @kamo-naoyuki
- [CI][Installation] Install wheel #2304 by @kamo-naoyuki
Acknowledgements
Special thanks to @Emrys365, @ShigekiKarita, @hchung12, @hirofumi0810, @kamo-naoyuki, @kan-bayashi, @nzhoward, @placebokkk, @qmpzzpmq.
ESPnet Version 0.9.0
New Features
- [New Features][ASR] Non-autoregressive ASR with Mask CTC #2070 by @YosukeHiguchi
- [New Features][ASR] Support Conformer model. #2144 by @pengchengguo
- [New Features][ASR][ST] CTC posterior visualization during training #2221 by @hirofumi0810
- [New Features][ESPnet2] Implement espnet2.bin.zenodo_upload #2168 by @kamo-naoyuki
- [New Features][ESPnet2] Python API for inference #2092 by @kamo-naoyuki
- [New Features][ESPnet2] Support TTS-Transformer in ESPnet2 #2134 by @kan-bayashi
- [New Features][ESPnet2][ASR] Enable batch joint decoding with CTC in recog API v2 #2197 by @takaaki-hori
- [New Features][ESPnet2][SE] Speech Enhancement Frontend for ESPNet2 Phase 1 #2124 by @LiChenda
- [New Features][ESPnet2][TTS] Support FastSpeech for ESPnet2 TTS #2149 by @kan-bayashi
- [New Features][ESPnet2][TTS] Support FastSpeech2 (+FastPitch) #2218 by @kan-bayashi
- [New Features][ESPnet2][TTS] Support GST in ESPnet2 TTS #2139 by @kan-bayashi
- [New Features][README][ASR] CTC forced alignment in E2E ASR Transformer model #2095 by @simpleoier
- [New Features][VC] Voice Transformer Network #2064 by @unilight
Enhancement
- [Enhancement] Fix error when downloading large files using download_from_google_drive.sh #2074 by @unilight
- [Enhancement][ASR] added more beam search info #2130 by @sw005320
- [Enhancement][ESPnet2] Change packed file of espnet2 to zip format #2161 by @kamo-naoyuki
- [Enhancement][ESPnet2] Make read_text faster #2114 by @kamo-naoyuki
- [Enhancement][ESPnet2] RESULTS.md -> README.md #2077 by @kamo-naoyuki
- [Enhancement][ESPnet2] Remove long wave in template recipe #2075 by @kamo-naoyuki
- [Enhancement][ESPnet2] Update ESPnet2 JSUT TTS recipe and TTS template #2110 by @kan-bayashi
- [Enhancement][MT][ST] Fix ST/MT models for compatibility with ASR #2179 by @hirofumi0810
- [Enhancement][ST] Add source case information to json files in ST task #2208 by @hirofumi0810
- [Enhancement][ST] Refactor multi-task learning in ST #2202 by @hirofumi0810
Recipe
- [Recipe][ASR] Add aidatatang_200zh recipe #2122 by @nzhoward
- [Recipe][ASR] Add chime6 info #2250 by @sw005320
- [Recipe][ASR] CHiME-6 recipe #2171 by @GNroy
- [Recipe][ASR] Fix a bug in espnet wsj recipe. #2145 by @houwenxin
- [Recipe][ASR] New Recipe for Yoloxóchitl-Mixtec (SLR89) #2085 by @ftshijt
- [Recipe][ASR] Support averaging model for Conformer. #2244 by @pengchengguo
- [Recipe][ASR] Updated model after tuning aidatatang_200zh recipe #2204 by @nzhoward
- [Recipe][ASR] created a recipe to run asr on ljspeech #1996 by @ibkuroyagi
- [Recipe][ASR] update model link (add pre-trained bpe model and lm model) #2101 by @ftshijt
- [Recipe][ESPnet2][ASR] espnet2 librispeech recipe #2109 by @sw005320
- [Recipe][ESPnet2][ASR] espnet2 librispeech v2 #2189 by @sw005320
- [Recipe][ESPnet2][ASR] update espnet2 aishell results #2150 by @Cescfangs
- [Recipe][ESPnet2][ASR][TTS] fix dev_set/eval_sets issues #2142 by @sw005320
- [Recipe][ESPnet2][TTS] Add ESPnet2 CSMSC TTS recipe #2129 by @kan-bayashi
- [Recipe][ESPnet2][TTS] Add ESPnet2 LJSpeech recipe #2117 by @kan-bayashi
- [Recipe][ESPnet2][TTS] Add VCTK recipe for ESPnet2 TTS #2165 by @kan-bayashi
- [Recipe][ESPnet2][TTS] Create espnet2 jsut/tts recipe #2047 by @kamo-naoyuki
Refactoring
- [Refactoring][ESPnet2] Change stats_dir naming not to overwrite #2111 by @kan-bayashi
- [Refactoring][ESPnet2] Move modules #2086 by @kamo-naoyuki
- [Refactoring][ESPnet2] Remove $KALDI_ROOT/tools/env.sh from path.sh #2242 by @kamo-naoyuki
- [Refactoring][ESPnet2] Several update for pretrain model #2212 by @kamo-naoyuki
- [Refactoring][ESPnet2] Update Makefile #2225 by @kamo-naoyuki
Documentation
- [README] Fix URL in README #2090 by @kan-bayashi
- [README] Update README about TTS #2079 by @kan-bayashi
- [README] Update README.md #2046 by @kamo-naoyuki
- [README] Update README.md #2067 by @kamo-naoyuki
- [README] Update README.md #2243 by @kamo-naoyuki
- [README] Update citation #2206 by @hirofumi0810
- [README] Update installation.md #2233 by @kamo-naoyuki
- [README][ESPnet2] Update egs2/TEMPLATE/README.md #2098 by @kamo-naoyuki
Bugfix
- [Bugfix] Add cupy.done in make python #2091 by @kan-bayashi
- [Bugfix] Append a missing space in cmd-line args in utils/dump_pcm.sh #2209 by @yistLin
- [Bugfix] Fix Makefile #2097 by @kamo-naoyuki
- [Bugfix] Fix minor bug of Makefile #2055 by @kamo-naoyuki
- [Bugfix] Fix old model compatibility #2048 #2060 #2063 by @kan-bayashi
- [Bugfix] Fix pretrained model #2053 #2069 by @kan-bayashi
- [Bugfix] Fix pyopenjtalk installation #2108 by @kan-bayashi
- [Bugfix] Fix typo in run.sh of TTS recipes #2216 by @hirofumi0810
- [Bugfix] Update Makefile to disable cupy for cuda=10.2 or later #2230 by @kamo-naoyuki
- [Bugfix] fix path of PESQ #2058 by @kamo-naoyuki
- [Bugfix] scorerinterface warning English correction #2076 by @qmpzzpmq
- [Bugfix][CI] Fix bug in attention plotting #2185 by @hirofumi0810
- [Bugfix][CI] Freeze the matplotlib version with 3.1.0 #2181 by @sw005320
- [Bugfix][CI] fix integration_test_ctc_align_wav.bats with a small model #2170 by @simpleoier
- [Bugfix][CI] temporarily disable subsample 6 and 8 tests #2205 by @sw005320
- [Bugfix][CI][MT][ST] Add integration test for ST/MT tasks #2210 by @hirofumi0810
- [Bugfix][ESPnet2] Add missing path.sh in egs2/vctk/tts1 #2167 by @kan-bayashi
- [Bugfix][ESPnet2] Fix TTS inference #2222 by @kan-bayashi
- [Bugfix][ESPnet2] Fix tts_inference when feats_extract is None #2176 by @kan-bayashi
- [Bugfix][ESPnet2] Fix bug for feats_type=extracted #2087 by @kamo-naoyuki
- [Bugfix][ESPnet2] Fix bug of iterable dataset when num_workers>=1 #2081 by @kamo-naoyuki
- [Bugfix][ESPnet2] Fix bug when espnet2/bin/tokenize_text.py --cutoff or --vocabulary_size is used #2158 by @kamo-naoyuki
- [Bugfix][ESPnet2] Fix log: benchmark -> deterministic #2080 by @kamo-naoyuki
- [Bugfix][ESPnet2] Implement configargparse in espnet2 #2157 by @kamo-naoyuki
- [Bugfix][ESPnet2] Select torchaudio version according to torch version #2214 by @kamo-naoyuki
- [Bugfix][ESPnet2] avoid UnboundLocalError when lm is not loaded #2227 by @kamo-naoyuki
- [Bugfix][ESPnet2] fix #2050 #2051 by @kamo-naoyuki
- [Bugfix][ESPnet2] fix #2198: PhonemeTokenizer can't perform with multiprocessing #2201 by @kamo-naoyuki
- [Bugfix][ESPnet2] fix best_model_criterion: wsj/asr1/conf/tuning/train_lm.yaml #2153 by @kamo-naoyuki
- [Bugfix][ESPnet2] fix bug of lm.py #2056 by @kamo-naoyuki
- [Bugfix][ESPnet2] fix the stage number: enh.sh #2220 by @kamo-naoyuki
- [Bugfix][ESPnet2] fix: decode_config -> inference_config #2239 by @kamo-naoyuki
- [Bugfix][ESPnet2][Recipe] Not removing short/long utterances for eval_sets #2112 by @kamo-naoyuki
- [Bugfix][ESPnet2][SE] Fix bugs in espnet2/enh and format related directory structures #2215 by @Emrys365
- [Bugfix][ESPnet2][TTS] Fix feature extractor of TTS for compatibility #2102 by @kamo-naoyuki
Acknowledgements
Special thanks to @Cescfangs, @Emrys365, @GNroy, @LiChenda, @YosukeHiguchi, @ftshijt, @hirofumi0810, @houwenxin, @ibkuroyagi, @kamo-naoyuki, @kan-bayashi, @nzhoward, @pengchengguo, @qmpzzpmq, @simpleoier, @sw005320, @takaaki-hori, @unilight, @yistLin.
ESPnet Version 0.8.0
ESPnet2
- [ESPnet2] Solve memory issue with super large corpus training #1972 by @kamo-naoyuki
- [ESPnet2] Added model parameter count to trainer #1867 by @SeanNaren
- [ESPnet2] Refactoring espnet2/utils/fileio.py -> espnet2/fileio #1807 by @kamo-naoyuki
New Features
- [New Features] Lightweight and Dynamic Convolutions. #1599 by @yuyfujit
- [New Features] Implement Ngram scorer #1946 by @qmpzzpmq
- [New Features] resampling in utils/compute-fbank-feats.py and utils/compute-stft-feats.py #2035 by @kamo-naoyuki
Enhancement
Documentation
- [Documentation] fix a typo for the decoder add_argument_group #2030 by @sw005320
- [Documentation] Update multiple GPU descriptions. #2016 by @sw005320
- [Documentation] Finetuning doc + freezing parameters option #1897 by @b-flo
Bugfix
- [Bugfix] Fix memory issue when resuming #2040 by @kamo-naoyuki
- [Bugfix] fixed typo in cmvn.py #1988 by @gullyboy007
- [Bugfix] update notebook #1986 by @ShigekiKarita
- [Bugfix] Fix freezing modules (when using multi-gpu) #1983 by @atozto9
- [Bugfix] Fix BLEU/PPL calculation during training #2009 by @hirofumi0810
- [Bugfix] Fix download file extension #2042 by @takenori-y
- [Bugfix] fix tedlium2/3 model link #2032 by @sw005320
- [Bugfix] Fix bug for pure Transformer-CTC #2023 by @hirofumi0810
- [Bugfix] li42 recipe: add li42 results; fix bug in adding language id "zh_TW" #1950 by @houwenxin
CI
- [CI] Add espnet2 in ci/doc.sh #1976 by @ShigekiKarita
- [CI] Add test for pytorch1.5 #1881 by @kamo-naoyuki
Acknowledgements
Special thanks to @SeanNaren, @ShigekiKarita, @atozto9, @b-flo, @gullyboy007, @hirofumi0810, @houwenxin, @kamo-naoyuki, @qmpzzpmq, @sw005320, @takenori-y, @yuyfujit.
ESPnet Version 0.7.0
The ESPnet project is moving on to a new endeavor! We launched espnet2, which aims to improve modularity (chainer-free, kaldi-free), provide a more customizable trainer, support distributed training, and achieve scalability. This work is mainly led by @kamo-naoyuki, with his great efforts and leadership. The project is one of the outcomes of our ESPnet hackathon in Tokyo 2019, which included a lot of discussion about the design, new features, and community contributions. espnet2 currently supports the main ASR recipes (with a well-designed recipe template) and a limited set of TTS recipes. We maintain both espnet1 and espnet2, but will gradually move our development to espnet2. The ESPnet project is further accelerated!
ESPnet2
- [ESPnet2] keep the latest model #1769 by @kamo-naoyuki
- [ESPnet2] Remove "E2E" from all comments #1766 by @kamo-naoyuki
- [ESPnet2] Refactoring for ESPnetDataset #1758 by @kamo-naoyuki
- [ESPnet2] Implement SpecAug for ESPnet2 #1746 by @kamo-naoyuki
- [ESPnet2] Implement BatchBinSampler #1742 by @kamo-naoyuki
- [ESPnet2] Support torch_optimizer #1739 by @kamo-naoyuki
- [ESPnet2] Log rotation for launch.py #1737 by @kamo-naoyuki
- [ESPnet2] Change the type of --chunk_length to str_or_int #1733 by @kamo-naoyuki
- [ESPnet2] Change cudnn deterministic mode to default #1732 by @kamo-naoyuki
- [ESPnet2] Add wsj results for espnet2 #1724 by @kamo-naoyuki
- [ESPnet2] Show estimated time to finish #1717 by @kamo-naoyuki
- [ESPnet2] Add --name option for training job #1714 by @kamo-naoyuki
- [ESPnet2] Show the log file when training process is failed: espnet2.bin.launch.py #1713 by @kamo-naoyuki
- [ESPnet2] --max_length -> --fold_length #1712 by @kamo-naoyuki
- [ESPnet2] Double quoter for NCCL_SOCKET_IFNAME #1706 by @kamo-naoyuki
- [ESPnet2] Save apex state in checkpoint and support apex optimizer #1705 by @kamo-naoyuki
- [ESPnet2] Update asr.sh #1694 by @zh794390558
- [ESPnet2] Update ctc.py #1688 by @zh794390558
- [ESPnet2] Update launch.py #1681 by @zh794390558
- [ESPnet2] Update asr.sh #1678 by @zh794390558
- [ESPnet2] --keep_n_best_checkpoints -> --keep_nbest_models #1647 by @kamo-naoyuki
- [ESPnet2] Avoid deprecated warning: reduction="none" #1510 by @kamo-naoyuki
- [ESPnet2] Minor change for speed perturbation #1627 by @kamo-naoyuki
- [ESPnet2] Fix how2 recipe #1620 by @kamo-naoyuki
- [ESPnet2] Fix recipes #1617 by @kamo-naoyuki
- [ESPnet2] Renaming #1610 by @kamo-naoyuki
- [ESPnet2] Implement chunk iterator #1608 by @kamo-naoyuki
- [ESPnet2] Update voxforge RESULTS #1601 by @kamo-naoyuki
- [ESPnet2] vivos recipe: --audio_format wav #1592 by @kamo-naoyuki
- [ESPnet2] Lower python requirements to 3.6 #1565 by @kamo-naoyuki
- [ESPnet2] dirha_wsj recipe for espnet2 #1556 by @yuekaizhang
- [ESPnet2] Update AISHELL ASR Recipe #1549 by @Emrys365
- [ESPnet2] Remove short data #1531 by @kamo-naoyuki
- [ESPnet2] [WIP] Update JSUT ASR Recipe #1529 by @YosukeHiguchi
- [ESPnet2] Update HOW2 recipe #1522 by @b-flo
- [ESPnet2] [WIP] Update CSJ ASR Recipe #1520 by @YosukeHiguchi
- [ESPnet2] Change NoamLR to deprecated and implement WarmupLR #1519 by @kamo-naoyuki
- [ESPnet2] Implement --max_cache_size option #1509 by @kamo-naoyuki
- [ESPnet2] distributed training #1506 by @kamo-naoyuki
- [ESPnet2] ESPNet2 Recipe Update -- commonvoice, babel, ami #1504 by @ftshijt
- [ESPnet2] Refactoring #1494 by @kamo-naoyuki
- [ESPnet2] Fix ci of flake8 part #1491 by @kamo-naoyuki
- [ESPnet2] Tensorboard, --num_iters_per_epoch, etc. #1487 by @kamo-naoyuki
- [ESPnet2] Fix espnet2.bin.pack #1486 by @kamo-naoyuki
- [ESPnet2] show_result.sh #1478 by @kamo-naoyuki
- [ESPnet2] Pack and Unpack model #1477 by @kamo-naoyuki
- [ESPnet2] collect-stats mode, trainer class, etc. #1462 by @kamo-naoyuki
- [ESPnet2] add test codes for asr decoders #1445 by @kamo-naoyuki
- [ESPnet2] Integrate Griffin-Lim with tts_decode() #1442 by @kan-bayashi
- [ESPnet2] Update ASR recipe #1439 by @kan-bayashi
- [ESPnet2] Update TTS recipes #1430 by @kan-bayashi
- [ESPnet2] Disable wer/cer calculation when training #1547 by @kamo-naoyuki
- [ESPnet2] Change CTC default to builtin #1546 by @kamo-naoyuki
- [ESPnet2] Update chime4 asr1 Recipe #1570 by @yuekaizhang
- [ESPnet2] Create documentation for espnet2 #1710 by @kamo-naoyuki
- [ESPnet2] shellcheck for local/data.sh #1524 by @kamo-naoyuki
- [ESPnet2] commonvoice: RESULTS.md -> README.md #1797 by @kamo-naoyuki
Bugfix
- [Bugfix] % -> percent: espnet2/tasks/abs_task.py #1767 by @kamo-naoyuki
- [Bugfix] Fix gpu mode for tts_inference.py #1755 by @kamo-naoyuki
- [Bugfix] Fix SubReporter #1748 by @kamo-naoyuki
- [Bugfix] Fix calculate_all_attentions for espnet2 #1747 by @kamo-naoyuki
- [Bugfix] Not to create the averaged model if --keep_nbest_models=1 #1744 by @kamo-naoyuki
- [Bugfix] Fix --best_model_criterions #1743 by @kamo-naoyuki
- [Bugfix] Fix the gpu device when resuming #1731 by @kamo-naoyuki
- [Bugfix] Fix error log for espnet2/bin/launch.py #1730 by @kamo-naoyuki
- [Bugfix] Disable CUDNN deterministic for CTC: espnet2/asr/ctc.py #1720 by @kamo-naoyuki
- [Bugfix] Update default.py #1698 by @zh794390558
- [Bugfix] Fix chunk iterator and refactoring for distributed training #1685 by @kamo-naoyuki
- [Bugfix] Update vgg_rnn_encoder.py #1676 by @zh794390558
- [Bugfix] [ESPnet2] chmod +x: run.sh for JSUT #1628 by @kamo-naoyuki
- [Bugfix] [ESPnet2] Remove nlsyms when word scoring #1614 by @kamo-naoyuki
- [Bugfix] [ESPnet2] Fix setup.sh #1596 by @kamo-naoyuki
- [Bugfix] [ESPnet2] Fix launch.py for slurm #1588 by @kamo-naoyuki
- [Bugfix] [ESPnet2] Fix ci for local/data.sh #1572 by @kamo-naoyuki
- [Bugfix] [ESPnet2] Fix nj of scripts/audio/format_wav_scp.sh #1550 by @kamo-naoyuki
- [Bugfix] [ESPnet2] Use load_scp_sequential in format_wav_scp.py #1541 by @kamo-naoyuki
- [Bugfix] [ESPnet2] Minor fix for CSJ recipe #1540 by @kamo-naoyuki
- [Bugfix] [ESPnet2] Fix transformer #1539 by @kamo-naoyuki
- [Bugfix] [ESPnet2] fix rnn_type when bidirectional is used #1533 by @kamo-naoyuki
- [Bugfix] [ESPnet2] Fix format_wav_scp.py #1532 by @kamo-naoyuki
- [Bugfix] [ESPnet2] Fix bug of using GPU even if CPU mode #1526 by @kamo-naoyuki
- [Bugfix] [ESPnet2] Fix --accum_grad #1525 by @kamo-naoyuki
- [Bugfix] [ESPnet2] Fix voxforge config #1511 by @kamo-naoyuki
- [Bugfix] [ESPnet2] Bug fix of splitting files for collect_stats mode #1505 by @kamo-naoyuki
- [Bugfix] fix to use queue.conf #1431 by @sw005320
- [Bugfix] [ESPnet2] Fix a bug in TTS #1428 by @kan-bayashi
- [Bugfix] [ESPnet2] Refactor Encoder and Decoder and bug fix #1427 by @kamo-naoyuki
- [Bugfix] [ESPnet2] Fix bug of text-chars converter #1426 by @kamo-naoyuki
- [Bugfix] Optionize trans_type in egs/ljspeech/tts2 #1789 by @kan-bayashi
- [Bugfix] bugfix in ljspeech/tts2 #1783 by @beckgom
- [Bugfix] missing argument for local/data_prep.sh added #1782 by @beckgom
- [Bugfix] avoid sentencepiece==0.1.90 #1923 by @kamo-naoyuki
- [Bugfix] FIX E523,E541,E741 #1918 by @kamo-naoyuki
- [Bugfix] fix reverse option for cmvn #1906 by @magictron
- [Bugfix] Error handling for Transformer with CTC-based VAD #1875 by @takenori-y
- [Bugfix] Revert deletion of init files #1842 by @Fhrozen
- [Bugfix] fix the missing link of tedlium3 #1841 by @sw005320
- [Bugfix] Add test for torch>1.1 #1840 by @kamo-naoyuki
- [Bugfix] Fix #1808: change the argument order of --batch_type for collect stat… #1810 by @kamo-naoyuki
- [Bugfix] Change to configargparse>=1.2.1 #1803 by @kamo-naoyuki
- [Bugfix] typo fixed for attention type #1793 by @beckgom
- [Bugfix] fix #1780 #1784 by @qmeeus
- [Bugfix] Fix bug of espnet2 asr_inference.py #1952 by @kamo-naoyuki
- [Bugfix] Minor fix of import place and comments #1959 by @kan-bayashi
New Features
- [New Features] Add utils/translate_wav.sh #1530 by @ShigekiKarita
- [New Features] Batch beam search V2 for Transformer (no CTC) #1402 by @ShigekiKarita
Enhancement
- [Enhancement] Support multiple sentences in synth_wav.sh #1788 by @kan-bayashi
- [Enhancement] fix+update transducer #1760 by @b-flo
Documentation
- [Documentation] Update notebook #1963 by @kan-bayashi
- [Documentation] Update installation manual #1960 by @kan-bayashi
- [Documentation] Update installation.md #1957 by @kamo-naoyuki
- [Documentation] Add note in synth_wav.sh #1785 by @kan-bayashi
- [Documentation] Update docs #1954 #1955 by @kamo-naoyuki
- [Documentation] Update docs #1938 by @kamo-naoyuki
- [Documentation] docs: added fbank link to the experiment readme #1910 by @kdubovikov
Recipe
- [Recipe] Added some TIMIT results #1819 by @sknadig
- [Recipe] add recipe for French Polyphone: ELRA-S0030_02 #1711 by @AdolfVonKleist
- [Recipe] Use espnet_tts_frontend #1794 by @kamo-naoyuki
CI
- [CI] Use cache in actions #1917 by @ShigekiKarita
- [CI] Apply black #1850 by @kamo-naoyuki
- [CI] Create .mergify.yml #1813 by @kamo-naoyuki
Acknowledgements
Special thanks to @AdolfVonKleist, @Emrys365, @Fhrozen, @ShigekiKarita, @YosukeHiguchi, @beckgom, @b-flo, @ftshijt, @kamo-naoyuki, @kan-bayashi, @kdubovikov, @magictron, @qmeeus, @sknadig, @sw005320, @takenori-y, @yuekaizhang, @zh794390558.