sarulab-speech/jsut-label

jsut-label

HTS-style context labels for the JSUT corpus, usable with speech synthesis systems such as HTS, Merlin, and nnmnkwii. Phonetic and prosodic information is based on manual annotation. Time information is automatically estimated using Julius. Currently, this repository provides the labels for the BASIC5000 subset. The pronounced texts and their kana readings are listed in ./text_kana. Input sequences for end-to-end speech synthesis are provided in ./e2e_symbol.
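As a rough illustration of how such labels might be consumed, the sketch below parses one HTS-style label line into start time, end time, and central phoneme. It assumes the common HTS conventions (times in units of 100 ns, central phoneme between `-` and `+`); the sample line is a shortened, hypothetical example rather than an actual line from this repository, and `parse_label_line` is an illustrative helper name.

```python
import re
from typing import NamedTuple

# Assumption: times follow the usual HTS convention of 100 ns units.
HTS_TIME_UNIT_SEC = 1e-7

# Assumption: the central phoneme sits between "-" and "+" in the context string.
_PHONEME_RE = re.compile(r"-([^+]+)\+")


class Segment(NamedTuple):
    start_sec: float
    end_sec: float
    phoneme: str
    context: str


def parse_label_line(line: str) -> Segment:
    """Split a 'start end context' label line and extract the central phoneme."""
    start, end, context = line.strip().split(maxsplit=2)
    match = _PHONEME_RE.search(context)
    if match is None:
        raise ValueError(f"no central phoneme found in: {context!r}")
    return Segment(
        start_sec=int(start) * HTS_TIME_UNIT_SEC,
        end_sec=int(end) * HTS_TIME_UNIT_SEC,
        phoneme=match.group(1),
        context=context,
    )


# Hypothetical, truncated label line used only for demonstration.
sample = "0 3125000 xx^xx-sil+m=i/A:xx+xx+xx"
segment = parse_label_line(sample)
print(segment.phoneme, segment.start_sec, segment.end_sec)
```

For real use, a library such as nnmnkwii already provides HTS label handling; this sketch is only meant to show the general shape of a label line.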

Notice

The context labels are not in exactly the same format as those created using OpenJTalk. The following are NOT supported in the labels.

  • Unvoiced vowels, which are generally annotated as A, E, I, O, and U.
  • Word information (part-of-speech, conjugation type, and inflected form)
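When phoneme sequences generated by OpenJTalk need to be compared with the phonemes in these labels, one possible approach is to fold the unvoiced-vowel symbols back to their voiced counterparts. The following is a minimal sketch under that assumption; `to_voiced_symbol` is a hypothetical helper, not part of this repository or of OpenJTalk.

```python
# Uppercase vowels conventionally mark unvoiced (devoiced) vowels.
# Other uppercase symbols (e.g. the moraic nasal "N") are left untouched.
DEVOICED_VOWELS = {"A", "E", "I", "O", "U"}


def to_voiced_symbol(phoneme: str) -> str:
    """Map an unvoiced-vowel symbol such as "I" to its voiced form "i"."""
    return phoneme.lower() if phoneme in DEVOICED_VOWELS else phoneme


# Hypothetical phoneme sequence containing one unvoiced vowel.
phonemes = ["k", "I", "t", "a", "N"]
normalized = [to_voiced_symbol(p) for p in phonemes]
print(normalized)
```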

License

The label data is licensed under CC-BY-SA 4.0, etc. See LICENSE.txt for details.

Contributors

Acknowledgements

This work was supported by the following grants:

Links

Reference

  • Ryosuke Sonobe, Shinnosuke Takamichi, and Hiroshi Saruwatari, "JSUT corpus: free large-scale Japanese speech corpus for end-to-end speech synthesis," arXiv preprint, 1711.00354, Sep. 2017.
  • Shinnosuke Takamichi, Ryosuke Sonobe, Kentaro Mitsui, Yuki Saito, Tomoki Koriyama, Naoko Tanji, Hiroshi Saruwatari, "JSUT and JVS: free Japanese voice corpora for accelerating speech synthesis research," Acoustical Science and Technology, Vol.xxx, No.xxx, pp.xxx-xxx, 2020.
