
Releases: espnet/espnet

ESPnet Version 0.4.2

23 Jul 09:04
74a2ca3

Bugfix

  • [Bugfix] Fix pytorch LM GPU training without cupy #981
  • [Bugfix] make tensorboard logging done every 100 iters #966
  • [Bugfix] Fix ER calculator #955
  • [Bugfix] Fix a typo bug in computing guided attention loss #956
  • [Bugfix] run.sh should exit if sourcing path.sh returns an error #954

Recipe

  • [Recipe] Update Librispeech recipe #970
  • [Recipe] New RNN and Transformer results for the AMI recipe (ihm) #978
  • [Recipe] BPE support for SwitchBoard & Transformer config #909
  • [Recipe] Update li10 #965
  • [Recipe] Update libri_trans #949

Enhancement

  • [Enhancement] transform: expose pad_mode for logmelspectrogram #957

Acknowledgements

Special thanks to @Fhrozen, @geekboood, @hirofumi0810, @Jzmo, @naxingyu, @r9y9, @ShigekiKarita.

ESPnet Version 0.4.1

03 Jul 11:41
9f0729f

Bugfix

  • [Bugfix] Fix a bug in calculate_all_attentions #862
  • [Bugfix] Fix bugs in frontend #875
  • [Bugfix] Fix grad noise v2 #912
  • [Bugfix] Fix plot fail #913
  • [Bugfix] Fix tgz typo #892
  • [Bugfix] Fix: Output dimension of Conv2dSubsampling #822 #921
  • [Bugfix] Fix: espnet/transform/transformation.py #866
  • [Bugfix] Fixed certain typos #893
  • [Bugfix] Modified if conditions #908
  • [Bugfix] fix bugs in grad noise #886
  • [Bugfix] Fix CER/WER & CER_CTC in the pytorch Transformer #936
  • [Bugfix] Update iwslt18 recipe #808

Documentation

  • [Documentation] Add model link #899
  • [Documentation] Document espnet tools and modules #884
  • [Documentation] Fix typo #930
  • [Documentation] Reformat docstrings in espnet/asr #914
  • [Documentation] Update CONTRIBUTING.md #880
  • [Documentation] add recipe related documentations to CONTRIBUTING.md #872
  • [Documentation] skip CI when gh-pages is deployed #901
  • [Documentation] use only conda to build the docs #895

Enhancement

  • [Enhancement] Script for docker builds from the local repo #877
  • [Enhancement] Demo script for TTS #871
  • [Enhancement] Fix plot attention for chainer transformer #940
  • [Enhancement] Implement FastSpeech #848
  • [Enhancement] Move the GitHub dependency links from Makefile to setup.py #858
  • [Enhancement] Support new version in Docker containers #836
  • [Enhancement] gradient noise injection from a standard normal distribution #881
  • [Enhancement] [Discussion] Create show_result.sh #874

Recipe

  • [Recipe] Add JSUT ASR recipe #793
  • [Recipe] AURORA4 RESULTS.md file #835
  • [Recipe] Add Librispeech French corpus #882
  • [Recipe] Add transformer config in m_ailabs/tts1 recipe #924
  • [Recipe] Change librispeech_french to libri_trans #903
  • [Recipe] Fix: utils/show_result.sh #915
  • [Recipe] Minor update for speech translation recipe #907
  • [Recipe] Transformer for CHiME4 Single Channel #837
  • [Recipe] Update LJSpeech RESULTS.md #861
  • [Recipe] Update LJSpeech RESULTS.md #887
  • [Recipe] Update Librispeech recipe #885
  • [Recipe] Update fisher callhome spanish for speech translation #868
  • [Recipe] libri_trans NMT recipe #931

Refactoring

  • [Refactoring] Refactor TTS Transformer #865
  • [Refactoring] test: avoid using grep and sed in subprocess and use the Python stdlib instead #854
  • [Refactoring] Update TTS module’s docstrings and refactor some modules #898

Acknowledgements

Special thanks to @27jiangziyan, @Fhrozen, @Masao-Someki, @ShigekiKarita, @SuperGops7, @creatorscan, @hirofumi0810, @kamo-naoyuki, @lumaku, @naxingyu, @r9y9, @simpleoier, @takenori-y.

ESPnet Version 0.4.0

15 Jun 02:22
0306f68

New features and improvements

  • E2E multi-channel system #596
    • Changed to use pip-install for pytorch_wpe #843
  • Transformer
  • SpecAugment #734 #745 #754
  • Streaming attention encoder-decoder E2E-ASR #757
    • Offline recognition demo #809
  • New batch making strategies #759
  • Guided Attention Loss #816

Important changes

  • drop python2 support
  • use utils/fix_data_dir.sh as default #660
  • CPU-only installation #677 #687 #704
  • fix to use python2 as default in travis #685
  • add CUDA_VERSION in Makefile #687
  • use PyTorch 1.0.1 as default #721
  • use YAML-format configuration files #722 (see the config sketch after this list)
  • modularize TTS components #746 #815
  • use Chainer/CuPy 6.0.0 as default #753
  • reinforce CI #763
  • Google drive downloader #798
  • New scripts to pack model and get system info #790 #802
  • change the scoring in multi-speaker case from shell to python #805
  • update patience in TTS recipes #817
  • n_average option in TTS #823
    • update TTS recipes to use config files #780
  • make ngpu=1 the default for all recipes #800
  • deprecate egs/librispeech/tts1 recipe #806
  • maintain the pytorch warp-ctc under espnet #838
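
As a rough illustration of the YAML-format configuration introduced in #722: training options that used to be passed as long command-line argument lists now live in a config file such as conf/train.yaml. The sketch below is a minimal, illustrative config; the key names mirror the corresponding command-line options and are assumptions, not a copy of a shipped recipe config.

    # conf/train.yaml -- minimal illustrative sketch (key names are assumptions)
    etype: vggblstmp   # encoder architecture type
    elayers: 3         # number of encoder layers
    dlayers: 2         # number of decoder layers
    atype: location    # attention type
    mtlalpha: 0.5      # CTC weight in the hybrid CTC/attention loss
    opt: adadelta      # optimizer
    epochs: 15

The recipe's run.sh then hands the file to the training script (roughly asr_train.py --config conf/train.yaml ...), so experiments can be switched by pointing at a different YAML file.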

New recipes

Recipe updates

  • Aishell
    • support Transformer #827
    • fix the indent of RESULTS.md in the aishell recipe #828
  • CSJ
  • HKUST
    • support Transformer #840
  • IWSLT18
    • add missing files for iwslt18 recipe #767
  • Librispeech
    • support Transformer #781
  • LJSpeech
  • Tedlium release2
    • support word LM in TEDLIUM recipe #683
    • fix duplicated line in tedlium recipe #714
    • fix a bug in the TEDLIUM recipe #771
    • support Transformer #803
  • Voxforge
    • bugfix in voxforge #684
    • unify rnn and transformer recipes for the voxforge task #769
    • support Transformer #758
    • update config files in the voxforge recipe #783
  • WSJ

Documentation

  • add citation bibtex entry for ESPnet #676
  • add NAACL paper replication link for the CMU Wilderness Multilingual Speech Dataset #717 #731
  • update library information #789
  • Add table of contents #812
  • add GPU decoding documentation #813
  • minibatch explanation #821

Bugfix

  • fix recognize_batch for 2D, location_recurrent, and multi-head attentions (#665) and add a test #681
  • fix CER/WER calculation during training #678
  • add version check for matplotlib installation #679
  • make sure hlens is a tensor in recognize_batch #680
  • fix choice between pytorch and pytorch-cpu #702
  • fix merge_json behavior (#699) when there are no labels #708
  • fix check_install.py #728
  • use ensure_ascii=False to make JSON human-readable #730 (see the sketch after this list)
  • Fix argument name for SummaryWriter #747
  • use scikit-learn 0.20 #749
  • fix pytorch for chainer v6.0.0 #772
  • fix model compatibility #799
  • fix minor typos in the recipes #801
  • bug fix: egs/chime4/asr1_multich/conf/train.yaml #826
  • bug fix: espnet/utils/training/batchfy.py #833
  • fix to use sentencepiece v.0.1.82 #839
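
For context on the ensure_ascii change above: Python's json module escapes non-ASCII characters by default, which turns Japanese or Chinese transcriptions in data.json into \uXXXX sequences. A minimal sketch of the difference (the sample text is only an illustration):

    # default behaviour: non-ASCII characters are escaped and hard to read
    python -c 'import json; print(json.dumps({"text": "こんにちは"}))'
    # with ensure_ascii=False the original characters are preserved
    python -c 'import json; print(json.dumps({"text": "こんにちは"}, ensure_ascii=False))'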

Acknowledgements

Special thanks to @27jiangziyan, @akreal, @bobchennan, @creatorscan, @danoneata, @Fhrozen, @gtache, @hirofumi0810, @jan-schuchardt, @jnishi, @kamo-naoyuki, @Masao-Someki, @oadams, @simpleoier, @sknadig, @ShigekiKarita, @takenori-y

ESPnet Version 0.3.1 (stable)

16 Mar 04:38
5454b19

New improvements

  • Add instant speech recognition #581
  • Add CTC greedy decoding CER monitor #587
  • Add Streaming encoder #638
  • Add Uni-directional encoder #624 #629
  • Add model compatibility test #615 #649
  • Update fisher_callhome_spanish recipe #625
  • Improve swbd scoring #614 #620
  • Improve memory usage in json merge script #579
  • Improve background job failure check in the decoding stage #627 #643 #648
  • Separate installation of basic tools and extra tools #628

Bugfix

Thank you for the many contributions from @kamo-naoyuki, @gtache, @simpleoier, @takenori-y, @Fhrozen, @JaejinCho, @pzelasko, @zh794390558, @kan-bayashi, @sw005320.

ESPnet v.0.3.0 beta

16 Feb 03:02
8e5c301
Pre-release

New features and improvements

  • Support Pytorch 1.0 #553
  • Support the use of Tensorboard #506
  • Support early stopping #508
  • Support stop_stage option #539
  • Support SortaGrad #550
  • Add GRU architecture #496
  • Add GPU batch decoding #318
  • Support HDF5 format instead of kaldi ark #412 #493
  • Add speech separation recipe #531
  • Add TTS recipes (German, Spanish, Italian, Japanese, ...) #562 #569 #519
  • Add ASR recipes #574 #519
  • Improve ASR recipes #491 #521 #546 #435 #467 #469
  • Improve speech translation recipes #468
  • Improve Python2/3 compatibility #567
  • Improve cmd.sh usage #538 #547
  • Add test scripts for shell scripts #484 #498
  • Change to use conda with Python3.7 as default #567
  • Python code modularization #440 #484

We really appreciate the many contributions from @gtache, @kamo-naoyuki, @hirofumi0810, @ShigekiKarita, @takenori-y, @simpleoier, @Fhrozen, @sas91, @mn5k, @JaejinCho, @Xiaofei-Wang, @jnishi, @Magic-Bubble.

ESPnet v.0.2.0 (Major update)

30 Aug 01:09
0693ffc
Pre-release

New features and improvements

  • add data prefetch #340
  • add new recipes
    • IWSLT speech translation recipe #325
    • REVERB challenge recipe #359
  • add test codes
    • for checking warp-ctc behavior in multitask mode #369
    • for multiple GPUs #362
    • for a single GPU #376
    • for read/write models #362 #376
  • add check script for python library installation #373 #389
  • improve some ASR baseline recipes by using a shallow and wide BLSTM encoder and subwords

Important changes

  • fix to use PyTorch 0.4.1 (drop support for PyTorch 0.3.x) #332
  • rename some functions
    • e2e_asr_attctc.py -> e2e_asr.py
    • e2e_asr_attctc_th.py -> e2e_asr_th.py
  • change the format of model.conf from pickle to JSON #342 (see the sketch after this list)
  • remove deprecated options #336
  • unify the data converter with TTS one #343
  • unify model variable arguments between TTS and ASR #337
  • fix PyTorch backend snapshot functions, including saving of optimizer state #362
  • avoid using feat-to-len: use write_utt2num_frames=true and read utt2num_frames instead of running feat-to-len #339
  • refactor asr_pytorch.py and asr_chainer.py
    • refactor the recog part in asr_chainer.py and asr_pytorch.py, especially after it gets the n-best list #370
    • create nets/e2e_common.py and move some common functions there
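
Because model.conf is now plain JSON rather than a pickle, it can be inspected directly from the shell; a minimal sketch (expdir is whatever experiment directory run.sh created):

    # pretty-print the training configuration stored next to the model
    python -m json.tool ${expdir}/results/model.conf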

Bug fix

  • fix warp-ctc gradient scaling (thanks @jnishi)
  • fix a warp-ctc multi-GPU bug (thanks @jnishi)
  • fix an undefined gpuid bug in CPU RNN training #379
  • fix the no-hypothesis bug #378
  • fix Python 3 compatibility #375 #341 (thanks @akreal)

ESPnet v.0.1.5 (minor update)

24 Aug 13:26
168a9e9
Pre-release
  • update the Librispeech ASR recipe and use subword modeling as default.
  • attached Librispeech ASR model (librispeech_asr1.tgz):
    • RNNLM: exp/train_rnnlm_2layer_bs256_unigram2000/rnnlm.model.best
    • ASR models: exp/train_960_vggblstm_e4_subsample1_2_2_1_1_unit1024_proj1024_d1_unit1024_location1024_aconvc10_aconvf100_mtlalpha0.5_adadelta_bs30_mli800_mlo150_unigram2000/results/{model.acc.best,model.conf}
    • performance (WER %):
      Librispeech dev_clean: 5.0
      Librispeech test_clean: 5.0
    • to use the above models, set the ASR model directory (expdir) and the RNNLM model directory (lmexpdir) in run.sh as follows:
expdir=exp/train_960_vggblstm_e4_subsample1_2_2_1_1_unit1024_proj1024_d1_unit1024_location1024_aconvc10_aconvf100_mtlalpha0.5_adadelta_bs30_mli800_mlo150_unigram2000
lmexpdir=exp/train_rnnlm_2layer_bs256_unigram2000

        ${decode_cmd} JOB=1:${nj} ${expdir}/${decode_dir}/log/decode.JOB.log \
            asr_recog.py \
            --ngpu ${ngpu} \
            --backend ${backend} \
            --recog-json ${feat_recog_dir}/split${nj}utt/data_${bpemode}${nbpe}.JOB.json \
            --result-label ${expdir}/${decode_dir}/data.JOB.json \
            --model ${expdir}/results/model.${recog_model}  \
            --model-conf ${expdir}/results/model.conf  \
            --beam-size ${beam_size} \
            --penalty ${penalty} \
            --maxlenratio ${maxlenratio} \
            --minlenratio ${minlenratio} \
            --ctc-weight ${ctc_weight} \
            --rnnlm ${lmexpdir}/rnnlm.model.best \
            --lm-weight ${lm_weight}

ESPnet v.0.1.4

24 Aug 02:42
2932e97
Pre-release
  • Added TTS recipe based on Tacotron2 egs/ljspeech/tts1
  • Extended the above TTS recipe to multispeaker TTS egs/librispeech/tts1/
  • Supported PyTorch 0.4.0
  • Added word level decoding
  • (Finally) fixed CNN (VGG) layer issues in PyTorch
  • Fixed warp CTC scaling issues in PyTorch
  • Added subword modeling based on the SentencePiece toolkit
  • Many bug fixes
  • Updated CSJ performance

Stable version for the JSALT18 summer school

20 Jun 12:35
b2420d8
  • bug fixes
  • improve the jsalt18e2e recipe
  • improve the JSON format
  • simplify Makefile

Change JSON format and use feature compression

12 Jun 23:55
c044f7a
  • change the JSON format to deal with multiple inputs and outputs
  • use feature compression to reduce data I/O (see the sketch below)
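
As a rough sketch of what feature compression means here: the dumped features are stored with Kaldi's built-in lossy compression, which shrinks the archives that have to be read during training. The paths below are placeholders, not taken from a recipe:

    # re-copy features with compression enabled to cut disk usage and I/O
    copy-feats --compress=true scp:data/train/feats.scp \
        ark,scp:dump/train/feats.ark,dump/train/feats.scp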