Releases: espnet/espnet
ESPnet Version 0.4.2
Bugfix
- [Bugfix] Fix pytorch LM GPU training without cupy #981
- [Bugfix] make tensorboard logging done every 100 iters #966
- [Bugfix] Fix ER calculator #955
- [Bugfix] Fix a typo bug in computing guided attention loss #956
- [Bugfix] run.sh should exit if sourcing path.sh returns an error #954
Recipe
- [Recipe] Update Librispeech recipe #970
- [Recipe] New RNN and Transformer results for the AMI recipe (ihm) #978
- [Recipe] BPE support for SwitchBoard & Transformer config #909
- [Recipe] Update li10 #965
- [Recipe] Update libri trans #949
Enhancement
- [Enhancement] transform: expose pad_mode for logmelspectrogram #957
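The `pad_mode` option exposed in #957 controls how a signal is padded before framing when it is shorter than the analysis window. A minimal numpy sketch of what such an option selects between (the helper below is illustrative, not ESPnet's actual transform API):

```python
import numpy as np

def pad_signal(x, target_len, pad_mode="constant"):
    """Pad a 1-D signal up to target_len using numpy's pad modes.

    pad_mode="constant" appends zeros; pad_mode="reflect" mirrors the
    signal at the edge, avoiding the hard discontinuity that zero
    padding introduces before STFT framing.
    """
    n_pad = max(0, target_len - len(x))
    return np.pad(x, (0, n_pad), mode=pad_mode)

x = np.array([1.0, 2.0, 3.0])
zeros = pad_signal(x, 5, pad_mode="constant")   # [1, 2, 3, 0, 0]
mirror = pad_signal(x, 5, pad_mode="reflect")   # [1, 2, 3, 2, 1]
```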
Acknowledgements
Special thanks to @Fhrozen, @geekboood, @hirofumi0810, @Jzmo, @naxingyu, @r9y9, @ShigekiKarita.
ESPnet Version 0.4.1
Bugfix
- [Bugfix] Fix a bug in calculate_all_attentions #862
- [Bugfix] Fix bugs in frontend #875
- [Bugfix] Fix grad noise v2 #912
- [Bugfix] Fix plot fail #913
- [Bugfix] Fix tgz typo #892
- [Bugfix] Fix: Output dimension of Conv2dSubsampling #822 #921
- [Bugfix] Fix: espnet/transform/transformation.py #866
- [Bugfix] Fixed certain typos #893
- [Bugfix] Modified if conditions #908
- [Bugfix] fix bugs in grad noise #886
- [Bugfix] CER/WER & CER_CTC in Transformer pytorch #936
- [Bugfix] Update iwslt18 recipe #808
Documentation
- [Documentation] Add model link #899
- [Documentation] Document espnet tools and modules #884
- [Documentation] Fix typo #930
- [Documentation] Reformat docstrings in espnet/asr #914
- [Documentation] Update CONTRIBUTING.md #880
- [Documentation] add recipe related documentations to CONTRIBUTING.md #872
- [Documentation] skip ci when gh-pages is deployed #901
- [Documentation] use only conda to build doc #895
Enhancement
- [Enhancement] Script for docker builds from the local repo #877
- [Enhancement] Demo script for TTS #871
- [Enhancement] Fix plot attention for chainer transformer #940
- [Enhancement] Implement Fast Speech #848
- [Enhancement] Move the dependency links to github from Makefile to setup.py #858
- [Enhancement] Support new version in Docker containers #836
- [Enhancement] gradient noise injection from std normal dis #881
- [Enhancement] [Discussion] Create show_result.sh #874
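#881 (and the related fixes #886 and #912) concern gradient noise injection: zero-mean Gaussian noise with an annealed variance is added to each gradient before the parameter update. A rough numpy sketch of the standard scheme, with the decay schedule of Neelakantan et al.; the constants and function name here are illustrative:

```python
import numpy as np

def add_grad_noise(grads, step, eta=0.01, tau=0.55, rng=None):
    """Add N(0, sigma^2) noise to each gradient array.

    The variance decays with the step count, sigma^2 = eta / (1 + step)^tau,
    so early updates are noisier than later ones.
    """
    rng = rng or np.random.default_rng(0)
    sigma = np.sqrt(eta / (1.0 + step) ** tau)
    return [g + rng.normal(0.0, sigma, size=g.shape) for g in grads]

grads = [np.zeros(3), np.zeros((2, 2))]
noisy = add_grad_noise(grads, step=0)
```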
Recipe
- [Recipe] Add Jsut asr recipe #793
- [Recipe] AURORA4 RESULTS.md file #835
- [Recipe] Add Librispeech French corpus #882
- [Recipe] Add transformer config in m_ailabs/tts1 recipe #924
- [Recipe] Change librispeech_french to libri_trans #903
- [Recipe] Fix: utils/show_result.sh #915
- [Recipe] Minor update for speech translation recipe #907
- [Recipe] Transformer for CHiME4 Single Channel #837
- [Recipe] Update LJSpeech RESULTS.md #861
- [Recipe] Update LJSpeech RESULTS.md #887
- [Recipe] Update Librispeech recipe #885
- [Recipe] Update fisher callhome spanish for speech translation #868
- [Recipe] libri_trans NMT recipe #931
Refactoring
- [Refactoring] Refactor TTS Transformer #865
- [Refactoring] test: avoid using grep and sed in subprocess and use python stdlib instead #854
- [Refactoring] Update TTS module’s docstrings and refactor some modules #898
Acknowledgements
Special thanks to @27jiangziyan, @Fhrozen, @Masao-Someki, @ShigekiKarita, @SuperGops7, @creatorscan, @hirofumi0810, @kamo-naoyuki, @lumaku, @naxingyu, @r9y9, @simpleoier, @takenori-y.
ESPnet Version 0.4.0
New features and improvements
- E2E multi-channel system #596
- Changed to use pip-install for pytorch_wpe #843
- Transformer
- SpecAugment #734 #745 #754
- Streaming attention encoder-decoder E2E-ASR #757
- Offline recognition demo #809
- New batch making strategies #759
- Guided Attention Loss #816
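SpecAugment (#734 #745 #754) augments training by masking random frequency bands and time spans of the log-mel features. A minimal numpy sketch of the two masking operations (time warping omitted; parameter names are illustrative, not ESPnet's config keys):

```python
import numpy as np

def spec_mask(feat, num_freq_mask=1, freq_width=3,
              num_time_mask=1, time_width=5, rng=None):
    """Zero out random frequency bands and time spans of a (T, F) feature."""
    rng = rng or np.random.default_rng(0)
    feat = feat.copy()
    T, F = feat.shape
    for _ in range(num_freq_mask):
        w = rng.integers(0, freq_width + 1)      # band width in mel bins
        f0 = rng.integers(0, max(1, F - w))      # band start
        feat[:, f0:f0 + w] = 0.0
    for _ in range(num_time_mask):
        w = rng.integers(0, time_width + 1)      # span width in frames
        t0 = rng.integers(0, max(1, T - w))      # span start
        feat[t0:t0 + w, :] = 0.0
    return feat

masked = spec_mask(np.ones((20, 8)))
```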
Important changes
- drop python2 support
- use `utils/fix_data_dir.sh` as default #660
- CPU-only installation #677 #687 #704
- fix to use python2 as default in travis #685
- add CUDA_VERSION in Makefile #687
- use Pytorch 1.0.1 as default #721
- use `yaml` format configuration file #722
- modularize TTS components #746 #815
- use Chainer/Cupy 6.0.0 as default #753
- reinforce CI #763
- Google drive downloader #798
- New scripts to pack model and get system info #790 #802
- change the scoring in multi-speaker case from shell to python #805
- update patience in TTS recipes #817
- `n_average` option in TTS #823
- update TTS recipes to use config files #780
- make `ngpu=1` the default for all recipes #800
- deprecate the `egs/librispeech/tts1` recipe #806
- maintain the pytorch warp-ctc under espnet #838
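The `n_average` option (#823) averages the parameters of the last few training snapshots before synthesis, which usually smooths out per-epoch variance. A toy sketch of the averaging itself, representing checkpoints as dicts of numpy arrays (ESPnet of course operates on real saved snapshots):

```python
import numpy as np

def average_checkpoints(snapshots):
    """Element-wise average of parameter dicts sharing the same keys."""
    n = len(snapshots)
    avg = {}
    for key in snapshots[0]:
        avg[key] = sum(s[key] for s in snapshots) / n
    return avg

snaps = [{"w": np.array([0.0, 2.0])}, {"w": np.array([2.0, 4.0])}]
avg = average_checkpoints(snaps)   # {"w": array([1., 3.])}
```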
New recipes
- AURORA4 #722 #770 #824
- JNAS #725
- LibriTTS #795
- Tedlium release3 #739
  - added the model link and missing files #831
- TIMIT #698
- Russian Open STT #768
Recipe updates
- Aishell
- CSJ
- HKUST
  - support Transformer #840
- IWSLT18
  - add missing files for iwslt18 recipe #767
- Librispeech
  - support Transformer #781
- LJSpeech
- Tedlium release2
- Voxforge
- WSJ
Documentation
- add citation bibtex entry for ESPnet #676
- add NAACL paper replication link for CMU Wilderness Multilingual Speech Dataset #717 #731
- update library information #789
- Add table of contents #812
- add GPU decoding documentation #813
- minibatch explanation #821
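The minibatch document (#821) explains the length-sorted batching the training scripts use, and #759 adds further strategies. The basic idea: sort utterances by length and cut fixed-size batches, so each batch holds similar-length inputs and padding is minimized. A rough sketch of that idea (names are illustrative, not ESPnet's batchfy API):

```python
def batchfy_by_size(utt_lengths, batch_size):
    """Sort utterance ids by length (longest first) and cut equal-size batches."""
    order = sorted(utt_lengths, key=utt_lengths.get, reverse=True)
    return [order[i:i + batch_size] for i in range(0, len(order), batch_size)]

lengths = {"u1": 120, "u2": 300, "u3": 280, "u4": 90}
batches = batchfy_by_size(lengths, batch_size=2)   # [['u2', 'u3'], ['u1', 'u4']]
```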
Bugfix
- fix recognize_batch for 2d, location_recurrent, and multi-head attentions for #665, and add a test #681
- fix CER/WER calculation during training #678
- add version check for matplotlib installation #679
- make sure `hlens` is a tensor in recognize_batch #680
- fix choice between pytorch and pytorch-cpu #702
- fix `merge_json` behavior (#699) when no labels for #708
- fix `check_install.py` #728
- use `ensure_ascii=False` to make json human-readable #730
- Fix argument name for SummaryWriter #747
- use scikit-learn 0.20 #749
- fix pytorch for chainer v6.0.0 #772
- fix model compatibility #799
- fix minor typos in the recipes #801
- bug fix: `egs/chime4/asr1_multich/conf/train.yaml` #826
- bug fix: `espnet/utils/training/batchfy.py` #833
- fix to use sentencepiece v0.1.82 #839
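One small fix worth illustrating: #730 writes data JSON with `ensure_ascii=False`, so non-ASCII transcriptions stay readable instead of being escaped:

```python
import json

entry = {"text": "日本語"}
# default: non-ASCII characters are escaped to \uXXXX sequences
escaped = json.dumps(entry)
# with ensure_ascii=False the transcription stays human-readable
readable = json.dumps(entry, ensure_ascii=False)   # '{"text": "日本語"}'
```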
Acknowledgements
Special thanks to @27jiangziyan, @akreal, @bobchennan, @creatorscan, @danoneata, @Fhrozen, @gtache, @hirofumi0810, @jan-schuchardt, @jnishi, @kamo-naoyuki, @Masao-Someki, @oadams, @simpleoier, @sknadig, @ShigekiKarita, @takenori-y
ESPnet Version 0.3.1 (stable)
New improvements
- Add instant speech recognition #581
- Add CTC greedy decoding CER monitor #587
- Add Streaming encoder #638
- Add Uni-directional encoder #624 #629
- Add model compatibility test #615 #649
- Update fisher_callhome_spanish recipe #625
- Improve swbd scoring #614 #620
- Improve memory usage in json merge script #579
- Improve background job failure check in decoding state #627 #643 #648
- Separate installation of basic tools and extra tools #628
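The CER monitor added in #587 relies on CTC greedy decoding: take the argmax label per frame, collapse repeats, then drop blanks. A minimal sketch of that decoding rule (treating label 0 as the blank is an assumption here):

```python
def ctc_greedy_decode(frame_labels, blank=0):
    """Collapse per-frame argmax labels into an output sequence.

    Repeated labels are merged first, then blank symbols are removed,
    following the standard CTC best-path decoding rule.
    """
    out = []
    prev = None
    for label in frame_labels:
        if label != prev and label != blank:
            out.append(label)
        prev = label
    return out

# frames: a a - b b - - c  ->  a b c  (with 0 as blank)
hyp = ctc_greedy_decode([1, 1, 0, 2, 2, 0, 0, 3])   # [1, 2, 3]
```

Note that a blank between two identical labels keeps them distinct: `[1, 0, 1]` decodes to `[1, 1]`.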
Bugfix
- Fix CTC type selection #617 #618
- Fix MultiProcessIterator #613
- Fix chainer sortgrad bug
- Fix installer #594 #595 #604 #609 #622
- Fix WSJ-mix recipe #610 #630 #641
- Fix remove_longshortdata.sh #646
Special thanks to @kamo-naoyuki, @gtache, @simpleoier, @takenori-y, @Fhrozen, @JaejinCho, @pzelasko, @zh794390558, @kan-bayashi, @sw005320.
ESPnet v.0.3.0 beta
New features and improvements
- Support Pytorch 1.0 #553
- Support the use of Tensorboard #506
- Support early stopping #508
- Support `stop_stage` option #539
- Support sortgrad #550
- Add GRU architecture #496
- Add GPU batch decoding #318
- Support HDF5 format instead of kaldi ark #412 #493
- Add speech separation recipe #531
- Add TTS recipes (German, Spanish, Italian, Japanese, ...) #562 #569 #519
- Add ASR recipes #574 #519
- Improve ASR recipes #491 #521 #546 #435 #467 #469
- Improve speech translation recipes #468
- Improve Python2/3 compatibility #567
- Improve cmd.sh usage #538 #547
- Add test scripts for shell scripts #484 #498
- Change to use conda with Python3.7 as default #567
- Python code modularization #440 #484
We really appreciate a lot of contributions: @gtache, @kamo-naoyuki, @hirofumi0810, @ShigekiKarita, @takenori-y, @simpleoier, @Fhrozen, @sas91, @mn5k, @JaejinCho, @Xiaofei-Wang, @jnishi, @Magic-Bubble.
ESPnet v.0.2.0 (Major update)
New features and improvements
- add data prefetch #340
- add new recipes
- add test codes
- add check script for python library installation #373 #389
- improve some ASR baseline recipes by using a shallow and wide BLSTM encoder and subwords
Important changes
- fix to use PyTorch 0.4.1 (drop support for PyTorch 0.3.x) #332
- rename some functions:
  - `e2e_asr_attctc.py` -> `e2e_asr.py`
  - `e2e_asr_attctc_th.py` -> `e2e_asr_th.py`
- change the format of model.conf from pickle to JSON #342
- remove deprecated options #336
- unify the data converter with TTS one #343
- unify model variable arguments between TTS and ASR #337
- fix pytorch backend snapshot functions including the save of optimizers #362
- avoid using `feat-to-len`: use `write_utt2num_frames=true` and read `utt2num_frames` instead of executing `feat-to-len` #339
- refactor `asr_pytorch.py` and `asr_chainer.py`
- refactor the recog part in asr_chainer.py and asr_pytorch, especially after it gets the nbest list #370
- make `nets/e2e_common.py` and move some common functions there
Bug fix
ESPnet v.0.1.5 (minor update)
- update the Librispeech ASR recipe and use subword modeling as default.
- attached Librispeech ASR model (librispeech_asr1.tgz):
- RNNLM: `exp/train_rnnlm_2layer_bs256_unigram2000/rnnlm.model.best`
- ASR models: `exp/train_960_vggblstm_e4_subsample1_2_2_1_1_unit1024_proj1024_d1_unit1024_location1024_aconvc10_aconvf100_mtlalpha0.5_adadelta_bs30_mli800_mlo150_unigram2000/results/{model.acc.best,model.conf}`
- performance (with RNNLM):

| | WER (%) |
|---|---|
| Librispeech dev_clean | 5.0 |
| Librispeech test_clean | 5.0 |
- when using the above models, please set the ASR model directory (`expdir`) and RNNLM model directory (`lmexpdir`) in `run.sh` as follows:
```shell
expdir=exp/train_960_vggblstm_e4_subsample1_2_2_1_1_unit1024_proj1024_d1_unit1024_location1024_aconvc10_aconvf100_mtlalpha0.5_adadelta_bs30_mli800_mlo150_unigram2000
lmexpdir=exp/train_rnnlm_2layer_bs256_unigram2000

${decode_cmd} JOB=1:${nj} ${expdir}/${decode_dir}/log/decode.JOB.log \
    asr_recog.py \
    --ngpu ${ngpu} \
    --backend ${backend} \
    --recog-json ${feat_recog_dir}/split${nj}utt/data_${bpemode}${nbpe}.JOB.json \
    --result-label ${expdir}/${decode_dir}/data.JOB.json \
    --model ${expdir}/results/model.${recog_model} \
    --model-conf ${expdir}/results/model.conf \
    --beam-size ${beam_size} \
    --penalty ${penalty} \
    --maxlenratio ${maxlenratio} \
    --minlenratio ${minlenratio} \
    --ctc-weight ${ctc_weight} \
    --rnnlm ${lmexpdir}/rnnlm.model.best \
    --lm-weight ${lm_weight} \
```
ESPnet v.0.1.4
- Added TTS recipe based on Tacotron2 (`egs/ljspeech/tts1`)
- Extended the above TTS recipe to multi-speaker TTS (`egs/librispeech/tts1/`)
- Supported PyTorch 0.4.0
- Added word level decoding
- (Finally) fixed CNN (VGG) layer issues in PyTorch
- Fixed warp CTC scaling issues in PyTorch
- Added subword modeling based on sentence piece toolkit
- Many bug fixes
- Updated CSJ performance
Stable version for the JSALT18 summer school
- bug fix
- improve the jsalt18e2e recipe
- improve the JSON format
- simplify Makefile
Change JSON format and use feature compression
- change the JSON format to deal with multiple inputs and outputs
- use feature compression to reduce the data I/O
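To handle multiple inputs and outputs, each utterance entry holds a list under its input and output fields rather than a single feature and a single target. A hypothetical example of such a structure (field names and values here are illustrative, not the exact ESPnet data JSON schema):

```python
import json

data = {
    "utts": {
        "utt1": {
            # several input streams, e.g. two microphone channels
            "input": [
                {"name": "input1", "feat": "feats1.ark:12", "shape": [420, 83]},
                {"name": "input2", "feat": "feats2.ark:12", "shape": [420, 83]},
            ],
            # several targets, e.g. character and subword label sequences
            "output": [
                {"name": "target1", "shape": [25, 52]},
                {"name": "target2", "shape": [12, 2000]},
            ],
        }
    }
}
text = json.dumps(data, ensure_ascii=False, indent=2)
```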