127-Hours-Brazilian-Portuguese-Conversational-Speech-Data-by-Mobile-Phone

Description

The 127 Hours - Brazilian Portuguese Conversational Speech Data set contains recordings from 142 native speakers with a balanced gender ratio. Speakers chose a few familiar topics from a given list and held conversations on them, which keeps the dialogues fluent and natural. The recordings were made on various mobile phones in quiet indoor environments. The audio format is 16 kHz, 16-bit, uncompressed WAV. All speech audio was manually transcribed, with the text content, the start and end time of each effective sentence, and speaker identification.

For more details, please refer to the link: https://www.nexdata.ai/datasets/1209?source=Github

Specifications

Format

16 kHz, 16-bit, uncompressed WAV, mono channel;
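
The format stated above can be checked against a file's header before training or evaluation. The following is a minimal sketch using Python's standard `wave` module. The function name `check_wav_format` and its default parameters are illustrative assumptions, not part of the dataset tooling.

```python
import wave


def check_wav_format(path, rate=16000, sample_width_bytes=2, channels=1):
    """Return a list of mismatches between a WAV header and the expected format.

    Defaults mirror the stated specification: 16 kHz, 16-bit (2 bytes), mono.
    An empty list means the header matches. (Hypothetical helper.)
    """
    problems = []
    with wave.open(path, "rb") as wav:
        if wav.getframerate() != rate:
            problems.append(f"sample rate {wav.getframerate()} != {rate}")
        if wav.getsampwidth() != sample_width_bytes:
            problems.append(
                f"sample width {wav.getsampwidth()} bytes != {sample_width_bytes}"
            )
        if wav.getnchannels() != channels:
            problems.append(f"channels {wav.getnchannels()} != {channels}")
    return problems
```

For example, `check_wav_format("foo_301_00130_16k-16.wav")` should return an empty list if that sample file matches the specification. The `wave` module reads only uncompressed PCM files, so a compressed file raises `wave.Error` instead of returning a mismatch list.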

Recording Environment

quiet indoor environment, without echo;

Recording content

dozens of topics are specified, and the speakers hold conversations on those topics while being recorded;

Demographics

142 speakers in total, 50% male and 50% female.

Annotation

transcription text, speaker identification, gender, and noise symbols;

Device

Android mobile phones and iPhones;

Application scenarios

speech recognition; voiceprint recognition;

Accuracy rate

the word accuracy rate is not less than 98%
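
The source does not define how the word accuracy rate is computed. One common convention, assumed here purely for illustration, is one minus the word-level edit distance between a reference transcript and a checked transcript, divided by the number of reference words. The function name `word_accuracy` is hypothetical.

```python
def word_accuracy(reference, hypothesis):
    """Word accuracy as 1 - (word-level edit distance / reference length).

    This is an assumed definition, not the dataset's documented method.
    Words are split on whitespace and compared exactly.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # Levenshtein distance over words, keeping only the previous row.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr

    return 1.0 - prev[-1] / len(ref)
```

Under this definition, a transcript with one wrong word out of four reference words scores 0.75, and the stated threshold corresponds to a score of at least 0.98.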

Licensing Information

Commercial License
