jsut-label

HTS-style context labels for the JSUT corpus, usable with speech synthesis systems such as HTS, Merlin, and nnmnkwii. Phonetic and prosodic information is based on manual annotation. Time information is automatically estimated using Julius. Currently, this repository provides labels for the BASIC5000 subset. The spoken texts and their kana readings are listed in ./text_kana. Input sequences for end-to-end speech synthesis are provided in ./e2e_symbol.
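As a rough illustration of how such labels can be read, the sketch below parses one line of an HTS-style label, assuming the usual layout of start time, end time (both in units of 100 ns), and a context string whose current phoneme sits between `-` and `+`. The function name `parse_label_line` and the sample line are hypothetical and are not taken from this repository; real context strings are much longer.

```python
import re

# HTS time stamps are expressed in units of 100 ns.
HTS_UNITS_PER_SECOND = 10_000_000


def parse_label_line(line):
    """Return (start_sec, end_sec, phoneme) for one 'start end context' line."""
    start, end, context = line.split(maxsplit=2)
    # The current phoneme is the field between the first '-' and the next '+'.
    match = re.search(r"-([^-+]+)\+", context)
    phoneme = match.group(1) if match else None
    return (
        int(start) / HTS_UNITS_PER_SECOND,
        int(end) / HTS_UNITS_PER_SECOND,
        phoneme,
    )


# Hypothetical, shortened line for illustration only.
sample = "0 3125000 xx^xx-sil+m=i/A:xx+xx+xx"
print(parse_label_line(sample))
```

Applying the same function to every line of a file under labels/basic5000 would give per-phoneme durations, provided the files follow this layout.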

Notice

The context labels are not in exactly the same format as those created using OpenJTalk. The following are NOT supported in the labels.

Unvoiced vowels, which are generally annotated as A, E, I, O, and U.
Word information (part-of-speech, conjugation type, and inflected form)
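
When comparing these labels with phoneme sequences produced by OpenJTalk, one option is to fold the unvoiced vowels back into their voiced counterparts first. The following is a minimal sketch under that assumption; the helper name `fold_unvoiced_vowels` and the example sequence are hypothetical, and other uppercase symbols (such as the moraic nasal N) are left untouched.

```python
# Uppercase vowels conventionally mark unvoiced (devoiced) vowels.
UNVOICED_VOWELS = {"A", "E", "I", "O", "U"}


def fold_unvoiced_vowels(phonemes):
    """Map unvoiced vowels to lowercase; keep every other symbol as-is."""
    return [p.lower() if p in UNVOICED_VOWELS else p for p in phonemes]


# Hypothetical phoneme sequence for illustration only.
print(fold_unvoiced_vowels(["d", "e", "s", "U", "N"]))
```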

License

The label data is licensed under CC-BY-SA 4.0, etc. See the LICENCE.txt file for details.

Contributors

Tomoki Koriyama (Main contributor) (@hyama5)
Shinnosuke Takamichi

Acknowledgements

This work was supported by the following grants:

KAKENHI Grant Number 17K12711
The GAP foundation program of the University of Tokyo

Links

JSUT (Japanese speech corpus of Saruwatari-lab., University of Tokyo)
r9y9/jsut-lab ... provides labels automatically generated using OpenJTalk.
HMM/DNN-based Speech Synthesis System (HTS) ... provides the label format in its demo scripts.

Reference

Ryosuke Sonobe, Shinnosuke Takamichi, and Hiroshi Saruwatari, "JSUT corpus: free large-scale Japanese speech corpus for end-to-end speech synthesis," arXiv preprint, 1711.00354, Sep. 2017.
Shinnosuke Takamichi, Ryosuke Sonobe, Kentaro Mitsui, Yuki Saito, Tomoki Koriyama, Naoko Tanji, and Hiroshi Saruwatari, "JSUT and JVS: free Japanese voice corpora for accelerating speech synthesis research," Acoustical Science and Technology, Vol.xxx, No.xxx, pp.xxx-xxx, 2020.
