Nexdata-AI/310-Hours-Turkish-Scripted-Monologue-Smartphone-Speech-Dataset

310-Hours-Turkish-Scripted-Monologue-Smartphone-Speech-Dataset

Description

The Turkish Scripted Monologue Smartphone Speech Dataset consists of monologues read from given scripts and recorded on smartphones, with each recording transcribed as text. It was collected from a large and geographically diverse pool of speakers (223 people in total, all from Turkey), which is intended to improve model performance on real-world, complex tasks.

The quality has been tested by various AI companies. We strictly adhere to data protection regulations and privacy standards, safeguarding user privacy and legal rights throughout data collection, storage, and usage. All of our datasets comply with GDPR, CCPA, and PIPL.

For more details, please refer to the link: https://www.nexdata.ai/datasets/1446?source=Github

Format

16 kHz, 16-bit, uncompressed WAV, mono channel.
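
As a rough sanity check after download, the format stated above can be verified per file. The sketch below uses only Python's standard `wave` module; the function name `check_wav_format` and the constants are illustrative assumptions, not part of the dataset or any Nexdata tooling.

```python
import wave

# Expected values taken from the Format section of this README.
EXPECTED_RATE_HZ = 16000
EXPECTED_SAMPLE_WIDTH_BYTES = 2  # 16-bit samples
EXPECTED_CHANNELS = 1            # mono


def check_wav_format(path):
    """Return a list of mismatches between a WAV file and the documented format.

    Hypothetical helper for illustration; an empty list means the header matches.
    """
    problems = []
    with wave.open(path, "rb") as wav:
        if wav.getframerate() != EXPECTED_RATE_HZ:
            problems.append(f"sample rate is {wav.getframerate()} Hz")
        if wav.getsampwidth() != EXPECTED_SAMPLE_WIDTH_BYTES:
            problems.append(f"sample width is {wav.getsampwidth() * 8} bits")
        if wav.getnchannels() != EXPECTED_CHANNELS:
            problems.append(f"channel count is {wav.getnchannels()}")
    return problems
```

Calling `check_wav_format("some_recording.wav")` on a conforming file should return an empty list; the file name here is a placeholder, since this README does not describe the dataset's directory layout.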

Recording condition

quiet indoor environment, low background noise, without echo;

Recording device

Android smartphone, iPhone;

Speaker

223 native speakers in total, 54% male and 46% female;

Country

Turkey (TUR);

Language (Region) Code

tr-TR;

Language

Turkish;

Features of annotation

Transcription text;

Accuracy Rate

Word Accuracy Rate (WAR) 95%;
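
This README does not say how the Word Accuracy Rate is computed. A common convention, assumed here, is WAR = 1 - WER, where WER is the word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words. The sketch below follows that convention with simple whitespace tokenization; the function name `word_accuracy_rate` is hypothetical and the vendor's actual procedure may differ.

```python
def word_accuracy_rate(reference, hypothesis):
    """Compute 1 - WER using word-level Levenshtein distance.

    Assumes whitespace tokenization and no text normalization.
    """
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp_words, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            ))
        prev = curr

    return 1.0 - prev[-1] / len(ref_words)
```

For example, `word_accuracy_rate("bugün hava çok güzel", "bugün hava güzel")` returns 0.75, since one of the four reference words is missing from the transcription.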

Licensing Information

Commercial License
