No longer able to align using phonemes directly as inputs #804
Comments
Note: this example is for Japanese, but I expect to do the same (feeding phonemes as input) in a few other Latin-script languages. I haven't yet checked whether they are affected by the same issue.
Also, one thing I had to fix while debugging (I can open a separate bug if needed). In file `tokenization/japanese.py`, line 19: `config_path = resource_dir.joinpath("japanese", "sudachi_config.json")`. This fails later because `config_path` is a `pathlib` object, which is not supported by sudachipy. It can be easily fixed by forcing a conversion to string: `config_path = str(resource_dir.joinpath("japanese", "sudachi_config.json"))`.
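A minimal sketch of the fix described above. The resource directory path here is a hypothetical stand-in; only the `joinpath` call and the `str()` conversion come from the comment:

```python
from pathlib import Path

# Hypothetical stand-in for MFA's resource directory.
resource_dir = Path("/opt/mfa/resources")

# Original: a pathlib.Path object, which sudachipy rejects.
config_path_broken = resource_dir.joinpath("japanese", "sudachi_config.json")

# Fixed: convert the Path to a plain string before passing it on.
config_path = str(resource_dir.joinpath("japanese", "sudachi_config.json"))

print(type(config_path_broken).__name__)  # PosixPath (on POSIX systems)
print(config_path)
```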
You can download the old 2.0 Japanese model via:
I see, thanks. Setting aside the tokenization issue, is there any other new feature or improvement in quality I would be missing by using the old 2.0.1a model instead of the 3.0.0 one? Also, since the 3.0.0 model uses text + tokenization, is it trying to align with all possible pronunciations (i.e., different phonemes with different probabilities for the same word in a dictionary) and picking the best match, or is it using some criterion to pick the most likely pronunciation first and then attempting to align with it?
I have been using `mfa align` to generate alignments of audio with input IPA phonemes directly instead of text. This was done by using a handmade dictionary that simply maps each IPA phoneme to itself. The reason for this is that my use case forces me to do G2P separately in my own way, while ensuring that the produced phonemes are supported by the MFA acoustic model.

However, after updating from version 2.x to 3.x (in particular, 3.0.7), I'm seeing that `mfa align` now attempts a text tokenization step that modifies my input IPA phonemes and affects the alignment results. Here's an example with Japanese text (好きにする):
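The self-mapping dictionary mentioned above could be generated with a short script like this one. The phoneme inventory is a small illustrative subset (in practice it must match the acoustic model's phone set), and the one-entry-per-line "word, then pronunciation" layout follows the usual MFA pronunciation-dictionary shape:

```python
# Sketch: build a pronunciation dictionary that maps each IPA phoneme
# to itself, so phoneme strings can be fed to `mfa align` as "text".
# Illustrative subset only; use the acoustic model's full phone set.
phonemes = ["a", "i", "ɯ", "e", "o", "k", "s", "ɕ", "t", "n"]

with open("identity.dict", "w", encoding="utf-8") as f:
    for p in phonemes:
        # One entry per line: the "word" followed by its pronunciation,
        # which here is the same phoneme.
        f.write(f"{p}\t{p}\n")
```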
(I got these tokenizer results by inspecting `tokenization/japanese.py` in the installed MFA package code while debugging the issue.)
Is there any way to bypass the tokenizer and align using my input phonemes directly?
Log file
No log files were generated, since the problem does not manifest as a runtime error.