Adding a new language, but getting errors in preprocessing step #465

Open
isolveit-aps opened this issue Apr 11, 2024 · 1 comment

@isolveit-aps

I am running into an issue similar to those in issue #49 and issue #316, with the error: "Failed to set eSpeak-ng voice"

I'm training a Faroese voice (https://en.wikipedia.org/wiki/Faroese_language). I managed to manually add Faroese to my local espeak-ng installation, which now speaks Faroese and produces phonemes for Faroese text:

$ espeak-ng -v fo "Hey Andras, hvussu gongur hjá tær í dag?" --ipa
hˈeɪː ˈandəras
kʋˈyssy ɡˈoʊŋɡyr çaʊ taɪr ˈui dˈaːx

However, in Piper I get these errors during preprocessing:

$ python3 -m piper_train.preprocess   --language fo   --input-dir /home/andras/piper/datasets/andras   --output-dir /home/andras/piper/fo_outpu1   --dataset-format ljspeech   --single-speaker   --sample-rate 22050
INFO:preprocess:Single speaker dataset
INFO:preprocess:Wrote dataset config
INFO:preprocess:Processing 260 utterance(s) with 20 worker(s)
ERROR:preprocess:Failed to process utterance: Utterance(text='“Jú, ein afturat gongur nokk. Ger so væl, góði” segði omman.', audio_path=PosixPath('/home/andras/piper/datasets/andras/wavs/0000000006.wav'), speaker=None, speaker_id=None, phonemes=None, phoneme_ids=None, audio_norm_path=None, audio_spec_path=None, missing_phonemes=Counter())
Traceback (most recent call last):
  File "/home/andras/piper/src/python/piper_train/preprocess.py", line 302, in phonemize_batch_espeak
    all_phonemes = phonemize_espeak(casing(utt.text), args.language)
  File "/home/andras/piper/src/python/.venv/lib/python3.10/site-packages/piper_phonemize/__init__.py", line 38, in phonemize_espeak
    return _phonemize_espeak(text, voice, str(data_path))
RuntimeError: Failed to set eSpeak-ng voice

I suspect this has to do with the interplay between Piper and espeak-ng, but I haven't been able to figure it out. The language code is fo: https://en.wikipedia.org/wiki/Faroese_language
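One way to narrow this down is to check whether the espeak-ng data directory that Piper actually uses knows about the language at all. The sketch below is only an illustration, not part of Piper: `bundled_data_dir` and `find_language_files` are hypothetical helper names, and it assumes the usual espeak-ng layout in which a language has a compiled dictionary named `<code>_dict` and a definition file named `<code>` somewhere under `lang/`. It also assumes the bundled data lives in an `espeak-ng-data` folder inside the `piper_phonemize` package.

```python
import importlib.util
from pathlib import Path
from typing import List, Optional


def bundled_data_dir(package: str = "piper_phonemize") -> Optional[Path]:
    """Return <package dir>/espeak-ng-data, or None if the package is not installed.

    Assumption: the package ships its espeak-ng data in a folder with this name.
    """
    try:
        spec = importlib.util.find_spec(package)
    except (ImportError, ValueError):
        return None
    if spec is None or spec.origin is None:
        return None
    return Path(spec.origin).parent / "espeak-ng-data"


def find_language_files(data_dir: Path, lang: str) -> List[Path]:
    """List files in an espeak-ng data directory that appear to define `lang`.

    Assumption: the compiled dictionary is <lang>_dict at the top level and the
    language definition is a file named <lang> somewhere below lang/.
    """
    data_dir = Path(data_dir)
    hits: List[Path] = []
    dict_file = data_dir / f"{lang}_dict"
    if dict_file.is_file():
        hits.append(dict_file)
    lang_root = data_dir / "lang"
    if lang_root.is_dir():
        hits.extend(sorted(p for p in lang_root.rglob(lang) if p.is_file()))
    return hits


# Example use (not run here):
#   data_dir = bundled_data_dir()
#   print(data_dir, find_language_files(data_dir, "fo") if data_dir else "not installed")
```

An empty result for `fo` in the bundled directory, combined with a non-empty result for the local espeak-ng data directory, would suggest that Piper is simply not reading the data that was edited by hand.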

A bit of background, if interested :)

I am currently trying to add my own voice, using less than 1 hour of my own recordings, to test Piper for Faroese. There is also an open-source dataset with 100 hours of speech across 433 speakers, so quite a lot of data is available if I can get training to work. A possible weakness of that dataset is that no individual speaker has much more than ½ hour of recordings. https://mtd.setur.fo/en/resource/ravnur-blark-1-0/

There is also a smaller, differently structured dataset based on the same recordings. It was used to train the ASR model in the VoisIT app (Android/App Store, my personal project) and is available here: https://repository.clarin.is/repository/xmlui/handle/20.500.12537/276
A trained ASR model and other models are also available: https://huggingface.co/carlosdanielhernandezmena?search_models=faroese
(Credits to the two teams above, "Ravnur" and "Ravnursson", for gathering the Faroese data and training the models.)

@atabekm

atabekm commented Apr 25, 2024

I had the same problem. As the stack trace shows, the issue is with the piper_phonemize package. If you browse to the /home/andras/piper/src/python/.venv/lib/python3.10/site-packages/piper_phonemize/ folder, you will see an espeak-ng-data folder, which I suppose is based on the espeak-ng project on GitHub. It doesn't contain the custom language you added locally, so you can copy the espeak-ng-data folder from your local espeak-ng installation to /home/andras/piper/src/python/.venv/lib/python3.10/site-packages/piper_phonemize/. This should solve the issue; at least it did for me.
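For anyone who prefers to script the copy, here is a minimal sketch of that step. `replace_bundled_data` is a hypothetical helper, not something Piper or piper_phonemize provides. Keeping a backup of the bundled folder is my own addition rather than part of the fix described above, and both paths are left for the caller to supply because they differ between machines.

```python
import shutil
from pathlib import Path


def replace_bundled_data(local_data: Path, package_dir: Path) -> Path:
    """Replace <package_dir>/espeak-ng-data with a copy of local_data.

    The first time this runs, the existing bundled folder is moved aside to
    espeak-ng-data.bak so the original data can be restored later.
    """
    local_data = Path(local_data)
    target = Path(package_dir) / "espeak-ng-data"
    if not local_data.is_dir():
        raise FileNotFoundError(f"No espeak-ng data directory at {local_data}")

    if target.exists():
        backup = target.with_name("espeak-ng-data.bak")
        if backup.exists():
            # A backup of the original already exists; drop the current copy.
            shutil.rmtree(target)
        else:
            shutil.move(str(target), str(backup))

    shutil.copytree(local_data, target)
    return target


# Example use (paths are placeholders, adjust to your system):
#   replace_bundled_data(
#       Path("/path/to/local/espeak-ng-data"),
#       Path("/home/andras/piper/src/python/.venv/lib/python3.10/site-packages/piper_phonemize"),
#   )
```

On my understanding, `espeak-ng --version` prints the data directory of the local installation, which is a convenient way to find the source path. Note that the local and bundled data may come from different espeak-ng versions, so treat a wholesale replacement as an experiment and keep the backup.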
