vislupus/Bulgarian-TTS-dataset

Speech Dataset

This dataset contains 3631 short audio clips in Bulgarian and 485 short audio clips in English. All of them are read by a single speaker. The metadata format is similar to that of LJ Speech, so the dataset is compatible with modern speech synthesis systems.
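Since the metadata is described as similar to LJ Speech, a loader along the following lines may be a useful starting point. This is a minimal sketch, not the repository's own tooling: it assumes pipe-delimited lines with the clip ID first and the transcription last, as in LJ Speech's `metadata.csv`. The `Clip` type, the `read_metadata` helper, and the sample clip IDs are hypothetical, and the exact columns in this dataset should be checked against the actual files.

```python
import csv
import io
from typing import Iterable, Iterator, NamedTuple


class Clip(NamedTuple):
    """One metadata entry (hypothetical type for this sketch)."""
    clip_id: str
    text: str


def read_metadata(lines: Iterable[str]) -> Iterator[Clip]:
    """Parse LJ Speech-style lines: id|transcription[|normalized transcription].

    Assumption: the first field is the clip ID and the last field is the
    text to use. Lines with fewer than two fields are skipped.
    """
    # QUOTE_NONE keeps quotation marks inside transcriptions untouched.
    reader = csv.reader(lines, delimiter="|", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) < 2:
            continue
        yield Clip(clip_id=row[0].strip(), text=row[-1].strip())


# Made-up sample lines; real IDs and texts come from the dataset's metadata file.
sample = io.StringIO(
    "clip_0001|Първи пример.\n"
    "clip_0002|Hello there.|Hello there.\n"
)
clips = list(read_metadata(sample))
```

To read a real file, pass an open file handle instead of the sample, for example `open(path, encoding="utf-8", newline="")`, where `path` points at the dataset's metadata file.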

The texts for the dataset are sourced from Chitanka and are in the public domain. The audio clips are obtained from the LibriVox project and are also in the public domain. The texts are read by Georgi Stoychev and Euthymius.

The audio clips were split and transcribed with pydub and Whisper, respectively.
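The splitting and transcription step could look roughly like the sketch below. It is an illustration under stated assumptions, not the script used to build this dataset: the silence thresholds, the Whisper model size (`"small"`), the `clip_NNNN` naming scheme, and the helper names `format_metadata_line` and `split_and_transcribe` are all hypothetical. It relies on pydub's `split_on_silence` and the `openai-whisper` package's `load_model` and `transcribe`, and pydub needs ffmpeg available to read compressed audio.

```python
import os


def format_metadata_line(clip_id: str, text: str) -> str:
    """Build one LJ Speech-style line: id|text.

    Pipes are removed from the text because they are the field delimiter,
    and runs of whitespace are collapsed to single spaces.
    """
    clean = " ".join(text.replace("|", " ").split())
    return f"{clip_id}|{clean}"


def split_and_transcribe(audio_path: str, out_dir: str, language: str = "bg") -> list:
    """Split a recording on silence and transcribe each clip (sketch only)."""
    # Imported here so the formatting helper above works without these
    # heavier dependencies installed.
    from pydub import AudioSegment
    from pydub.silence import split_on_silence
    import whisper

    audio = AudioSegment.from_file(audio_path)
    chunks = split_on_silence(
        audio,
        min_silence_len=500,              # assumed: pauses of at least 0.5 s
        silence_thresh=audio.dBFS - 16,   # assumed: 16 dB below average level
        keep_silence=200,                 # keep 0.2 s of padding around speech
    )

    model = whisper.load_model("small")   # assumed model size
    os.makedirs(out_dir, exist_ok=True)

    lines = []
    for i, chunk in enumerate(chunks):
        clip_id = f"clip_{i:04d}"         # hypothetical naming scheme
        wav_path = os.path.join(out_dir, clip_id + ".wav")
        chunk.export(wav_path, format="wav")
        result = model.transcribe(wav_path, language=language)
        lines.append(format_metadata_line(clip_id, result["text"]))
    return lines
```

The thresholds usually need tuning per recording, and automatic transcriptions of this kind still need manual review.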

"Епопея на Забравените" ("Epic of the Forgotten") is a literary work by Ivan Vazov, a renowned Bulgarian author. The audio recording runs 31 minutes and 6 seconds.

"Старопланински легенди" ("Legends of Stara Planina") is a collection of ten classic short stories by Yordan Yovkov, one of Bulgaria's most celebrated writers. The tales explore themes of love, truth, goodness, morality, and beauty, set against the backdrop of the Balkan Mountains. The audio recording runs 4 hours, 7 minutes, and 20 seconds.

"Жетварят" ("The Harvester") is a novel by Yordan Yovkov. It deals with human greed, hatred, and lust for power, as well as the transformative power of faith and the triumph of truth and goodness. The audio recording runs 5 hours, 32 minutes, and 57 seconds.

Important: The transcriptions were produced automatically and should be checked for mistakes!