SEAME-dev-set

For performance evaluation, we extract two subsets of the SEAME corpus [1] as test sets: dev_man, which is dominated by Mandarin speech, and dev_sge, which is dominated by Singapore English. Each test set contains 10 speakers and is gender-balanced.

| Set | Speakers | Hours |
| --- | --- | --- |
| train | 134 | 101.13 |
| dev_man | 10 | 7.49 |
| dev_sge | 10 | 3.93 |

For the training set, we share only the list of wav files; the audio itself can be found in LDC2015S04. If you have any questions, please contact me (zengzp0912@gmail.com).
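As a rough illustration of how a shared file list might be matched against a local copy of LDC2015S04, the sketch below reads a list and looks up each entry by file stem. The list format (one entry per line), the audio file extensions, and the function names `read_wav_list`, `index_audio`, and `split_found_missing` are assumptions made for this example, not something this repository defines.

```python
from pathlib import Path


def read_wav_list(list_path):
    """Return the non-empty, stripped lines of a list file, in order.

    Assumption: the list holds one wav entry per line.
    """
    with open(list_path, encoding="utf-8") as f:
        return [line.strip() for line in f if line.strip()]


def index_audio(corpus_root, suffixes=(".wav", ".flac")):
    """Index audio files under corpus_root by file stem (name without extension).

    Assumption: stems are unique within the corpus; the first match is kept.
    """
    index = {}
    for path in Path(corpus_root).rglob("*"):
        if path.is_file() and path.suffix.lower() in suffixes:
            index.setdefault(path.stem, path)
    return index


def split_found_missing(entries, index):
    """Split list entries into those present in the index and those absent."""
    found, missing = {}, []
    for entry in entries:
        stem = Path(entry).stem
        if stem in index:
            found[entry] = index[stem]
        else:
            missing.append(entry)
    return found, missing


# Hypothetical usage (both paths are placeholders):
# entries = read_wav_list("train/wav.list")
# found, missing = split_found_missing(entries, index_audio("/data/LDC2015S04"))
# print(len(found), "found;", len(missing), "missing")
```

Matching on the stem rather than the full path keeps the sketch independent of how the corpus is laid out on disk, at the cost of assuming that file names do not repeat across directories.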

References

[1] Dau-Cheng Lyu, Tien Ping Tan, Eng Siong Chng, and Haizhou Li, “SEAME: a Mandarin-English Code-switching Speech Corpus in South-East Asia,” in INTERSPEECH, 2010, vol. 10, pp. 1986–1989.

[2] Zhiping Zeng, Yerbolat Khassanov, Van Tung Pham, Haihua Xu, Eng Siong Chng, and Haizhou Li, “On the End-to-End Solution to Mandarin-English Code-switching Speech Recognition,” arXiv preprint arXiv:1811.00241, 2018.
