189-Hours-Latin-American-Spanish-Children-Spontaneous-Speech-Data

Description

The 189 Hours - Latin American Spanish Children Spontaneous Speech Data is a collection of speech clips covering multiple topics. All audio was manually transcribed, and speaker identity, gender, and other attributes are annotated. The dataset can be used for voiceprint recognition model training, corpus construction for machine translation, and algorithm research.

For more details, please refer to the link: https://www.nexdata.ai/datasets/1250?source=Github

Specifications

Format

16 kHz, 16-bit, mono channel
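
The format stated above can be checked per clip with Python's standard `wave` module. The following is a minimal sketch, assuming the clips are uncompressed PCM WAV files; the function name `matches_spec` is illustrative and not part of the dataset.

```python
import wave


def matches_spec(path, rate=16000, sample_width_bytes=2, channels=1):
    """Return True if the WAV file at `path` is 16 kHz, 16-bit, mono."""
    with wave.open(path, "rb") as wav:
        return (
            wav.getframerate() == rate                    # 16 kHz sample rate
            and wav.getsampwidth() == sample_width_bytes  # 2 bytes = 16 bits
            and wav.getnchannels() == channels            # mono
        )
```

Calling `matches_spec("000170_1_1.wav")` on a downloaded sample would be one way to confirm a clip before feeding it to a training pipeline.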

Age

children aged 12 and under

Content category

interviews, self-media, variety shows, etc.

Language

Latin American Spanish;

Annotation

transcription text, speaker identification, and gender

Application scenarios

speech recognition, video caption generation and video content review;

Accuracy

A Word Accuracy Rate (SAR) of no less than 98%.
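
This README does not say how the accuracy figure is computed. One common convention, assumed here purely for illustration, is word accuracy = 1 - (word-level edit distance / number of reference words). The sketch below implements that convention; `word_accuracy` is a hypothetical helper, not the vendor's documented method.

```python
def word_accuracy(reference, hypothesis):
    """Word accuracy as 1 - (word-level edit distance / reference length).

    Assumption: whitespace tokenization and the standard Levenshtein
    distance over words. The dataset's own metric may differ.
    """
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the words of ref processed
    # so far and the first j words of hyp.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return 1.0 - prev[-1] / len(ref)
```

Under this definition the value can drop below zero when the hypothesis contains many inserted words, so it is best read as a rough quality indicator rather than a strict percentage.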

Licensing Information

Commercial License

Sample files

000170_1_1.txt
000170_1_1.wav
000170_1_10.txt
000170_1_10.wav
000170_1_11.txt
000170_1_11.wav
000170_1_2.txt
000170_1_2.wav
000170_1_3.txt
000170_1_3.wav
000170_1_4.txt
000170_1_4.wav
000170_1_5.txt
000170_1_5.wav
000170_1_6.txt
000170_1_6.wav
000170_1_7.txt
000170_1_7.wav
000170_1_8.txt
000170_1_8.wav
000170_1_9.txt
000170_1_9.wav
000172_1_1.txt
000172_1_1.wav
000172_1_10.txt
000172_1_10.wav
000172_1_11.txt
000172_1_11.wav
000172_1_2.txt
000172_1_2.wav
000172_1_3.txt
000172_1_3.wav
000172_1_4.txt
000172_1_4.wav
000172_1_5.txt
000172_1_5.wav
000172_1_6.txt
000172_1_6.wav
000172_1_7.txt
000172_1_7.wav
000172_1_8.txt
000172_1_8.wav
000172_1_9.txt
000172_1_9.wav
README.md
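
The file names in the listing above suggest that each `.wav` clip has a `.txt` file with the same stem. A minimal sketch for pairing them follows. It assumes that the `.txt` file holds the transcript of the matching clip and that every stem consists of numeric fields joined by underscores. The meaning of those fields is not documented here, and the names `pair_clips` and `clip_sort_key` are illustrative.

```python
from pathlib import Path


def pair_clips(root):
    """Map each clip stem to its (wav_path, txt_path) pair.

    Clips without a same-named .txt file are skipped.
    """
    root = Path(root)
    pairs = {}
    for wav_path in sorted(root.glob("*.wav")):
        txt_path = wav_path.with_suffix(".txt")
        if txt_path.exists():
            pairs[wav_path.stem] = (wav_path, txt_path)
    return pairs


def clip_sort_key(stem):
    """Order stems such as '000170_1_10' numerically rather than lexically."""
    return tuple(int(part) for part in stem.split("_"))
```

Sorting the keys of `pair_clips(...)` with `key=clip_sort_key` places `000170_1_2` before `000170_1_10`, unlike the lexical order of the listing.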
