I have an issue when trying to use another English dataset. Why does inference from the packed test set work (CUDA_VISIBLE_DEVICES=0 python tasks/run.py --config usr/configs/midi/e2e/opencpop/ds100_adj_rel.yaml --exp_name $MY_DS_EXP_NAME --reset --infer), while inference from raw input (python inference/svs/ds_e2e.py --config usr/configs/midi/e2e/opencpop/ds100_adj_rel.yaml --exp_name $MY_DS_EXP_NAME) needs the same phoneme set size? #74

Open
michaellin99999 opened this issue Oct 3, 2022 · 14 comments

Comments

@michaellin99999

    Hello, I have an issue when trying to use another English dataset. I'm wondering why inference from the packed test set works (`CUDA_VISIBLE_DEVICES=0 python tasks/run.py --config usr/configs/midi/e2e/opencpop/ds100_adj_rel.yaml --exp_name $MY_DS_EXP_NAME --reset --infer`), while inference from raw input (`python inference/svs/ds_e2e.py --config usr/configs/midi/e2e/opencpop/ds100_adj_rel.yaml --exp_name $MY_DS_EXP_NAME`) needs the same phoneme set size?

Originally posted by @Wayne-wonderai in #29 (comment)

@michaellin99999
Author

I have the same issue.

@MrZixi
Collaborator

MrZixi commented Oct 9, 2022

When using our configs on your own dataset, please check that "binary_data_dir" in hparams points to your binarized data directory, because the phoneme dictionary text file there determines the dimension of phone_encoder in the model.
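
To illustrate the point above, here is a minimal sketch of how the size of a phoneme dictionary fixes the number of rows in an encoder's embedding table. It is not DiffSinger's actual code: `RESERVED_TOKENS`, `HIDDEN_SIZE`, the helper names, and both toy phoneme lists are assumptions made up for this example.

```python
# Hypothetical sketch: the phoneme dictionary decides the embedding table size.
# RESERVED_TOKENS and HIDDEN_SIZE are illustrative assumptions, not DiffSinger values.
RESERVED_TOKENS = ["<pad>", "<unk>"]
HIDDEN_SIZE = 256


def build_vocab(phonemes):
    """Map each token to an integer ID; reserved tokens come first."""
    tokens = RESERVED_TOKENS + sorted(set(phonemes))
    return {tok: idx for idx, tok in enumerate(tokens)}


def embedding_shape(vocab):
    """An embedding table has one row per token in the vocabulary."""
    return (len(vocab), HIDDEN_SIZE)


# Toy stand-ins for the original dictionary and a new English one.
original_phonemes = ["a", "ai", "b", "ch", "zh"]
english_phonemes = ["AA", "AE", "B", "CH", "DH", "EH", "TH"]

shape_pretrained = embedding_shape(build_vocab(original_phonemes))
shape_custom = embedding_shape(build_vocab(english_phonemes))

print(shape_pretrained)                  # (7, 256)
print(shape_custom)                      # (9, 256)
print(shape_pretrained == shape_custom)  # False
```

If the model is built from one dictionary and the weights were saved with another, the two shapes differ and the weights cannot be loaded as-is.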

@michaellin99999
Author

So, by pointing "binary_data_dir" to our own binarized data, the dimension of phone_encoder should change to fit our model?

@michaellin99999
Author

We get this issue:
[Screenshot from 2022-10-04 17-40-29]

@michaellin99999
Author

    When using our configs on your own dataset, please check that "binary_data_dir" in hparams points to your binarized data directory, because the phoneme dictionary text file there determines the dimension of phone_encoder in the model.

I get this issue:
[Screenshot from 2022-10-04 17-40-29]

@MrZixi
Collaborator

MrZixi commented Oct 9, 2022

Sorry, I may have misunderstood your issue. If you want to run inference from our pretrained ckpt, please make sure your phoneme dictionary is exactly the same as ours, because some layers in the pretrained ckpt depend on it. Otherwise, phoneme units may be encoded incorrectly because the dictionaries differ.
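
The mis-encoding risk can be shown with a small sketch. The ID scheme here (IDs assigned by sorted order) and the toy dictionaries are assumptions for illustration, not necessarily how DiffSinger assigns IDs; the point is only that the same phoneme can receive a different ID under a different dictionary.

```python
def build_vocab(phonemes):
    """Assign IDs by sorted order (an assumed scheme, for illustration only)."""
    return {p: i for i, p in enumerate(sorted(set(phonemes)))}


def encode(sequence, vocab):
    """Turn a phoneme sequence into the integer IDs the model consumes."""
    return [vocab[p] for p in sequence]


# Dictionary used when the checkpoint was trained (toy example).
training_vocab = build_vocab(["a", "b", "i", "n"])
# Custom dictionary with one extra phoneme, which shifts the later IDs.
custom_vocab = build_vocab(["a", "b", "e", "i", "n"])

sequence = ["n", "i"]
ids_train = encode(sequence, training_vocab)
ids_custom = encode(sequence, custom_vocab)

# What the trained model would "see" if fed IDs from the custom dictionary.
id_to_training_phoneme = {i: p for p, i in training_vocab.items()}
seen_by_model = [id_to_training_phoneme.get(i, "<out of range>") for i in ids_custom]

print(ids_train)      # [3, 2]
print(ids_custom)     # [4, 3]
print(seen_by_model)  # ['<out of range>', 'n']
```

In this toy case, "n" falls outside the trained embedding table and "i" lands on the row that was trained for "n".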

@MrZixi
Collaborator

MrZixi commented Oct 9, 2022

If you want to use a custom phoneme dictionary, please follow our guide and re-run the training.

@michaellin99999
Author

    If you want to use a custom phoneme dictionary, please follow our guide and re-run the training.

We did that but ran into the issue above. We retrained FFT and DiffSinger, and when we try to run them in sequence, the error above is shown. Can you point us to where the model is defined so we can debug what is causing this? We can't pinpoint what requires the missing keys.

@michaellin99999
Author

michaellin99999 commented Oct 9, 2022

    If you want to use a custom phoneme dictionary, please follow our guide and re-run the training.

We followed this tutorial (https://github.com/MoonInTheRiver/DiffSinger/blob/master/docs/README-SVS.md) to retrain on an English dataset, but when we connect FFT and DiffSinger, the error above is reported.
[Screenshot from 2022-10-04 17-40-29]
We can't find which script consumes these state_dict keys. Could you point us to the relevant line? Also, in which file is each layer of the DiffSinger model defined? We can't find that either.

@michaellin99999
Author

    If you want to use a custom phoneme dictionary, please follow our guide and re-run the training.

When we retrain (with a different phoneme dimension) and ignore the phonemes, the validation script can generate singing voice that resembles the new data, but the inference script doesn't work.

@MrZixi
Collaborator

MrZixi commented Oct 10, 2022

They are in the modules/***.

@michaellin99999
Author

    They are in the modules/***.

[Screenshot]

Where does the run.py file get the list of modules to load?

[Screenshot]

@michaellin99999
Author

michaellin99999 commented Oct 10, 2022

    They are in the modules/***.

Thank you. One last question: which code is responsible for checking the model size and parameters, i.e. what raises the "missing keys in state_dict" error when loading the state_dict for fastspeech2MIDI? And can we ignore that error if we are training our own model?
[Screenshot]
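
For context on the question above: in PyTorch, this kind of message comes from `load_state_dict`, which in strict mode raises an error listing the missing and unexpected keys. One way to narrow the problem down is to compare the parameter names and shapes the model expects with those stored in the checkpoint. The sketch below does that on plain dictionaries so it runs without PyTorch; the key names and shapes are hypothetical, and with a real model the two dictionaries would be built from `model.state_dict()` and the checkpoint's saved state_dict.

```python
def compare_state_dicts(model_shapes, ckpt_shapes):
    """Return (missing, unexpected, mismatched) key lists.

    missing:    keys the model expects but the checkpoint lacks
    unexpected: keys in the checkpoint that the model does not have
    mismatched: keys present in both but with different shapes
    """
    missing = sorted(k for k in model_shapes if k not in ckpt_shapes)
    unexpected = sorted(k for k in ckpt_shapes if k not in model_shapes)
    mismatched = sorted(
        k for k in model_shapes
        if k in ckpt_shapes and model_shapes[k] != ckpt_shapes[k]
    )
    return missing, unexpected, mismatched


# Hypothetical parameter names and shapes, for illustration only.
model_shapes = {
    "encoder.embed_tokens.weight": (62, 256),
    "encoder.layers.0.weight": (256, 256),
    "pitch_embed.weight": (300, 256),
}
ckpt_shapes = {
    "encoder.embed_tokens.weight": (9, 256),
    "encoder.layers.0.weight": (256, 256),
    "extra_module.weight": (256,),
}

missing, unexpected, mismatched = compare_state_dicts(model_shapes, ckpt_shapes)
print("missing:", missing)        # ['pitch_embed.weight']
print("unexpected:", unexpected)  # ['extra_module.weight']
print("mismatched:", mismatched)  # ['encoder.embed_tokens.weight']
```

Missing keys usually mean the model being built differs from the one that was saved, so they are a signal to investigate rather than something to ignore.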

@li-henan

    They are in the modules/***.

    Thank you. One last question: which code is responsible for checking the model size and parameters, i.e. what raises the "missing keys in state_dict" error when loading the state_dict for fastspeech2MIDI? And can we ignore that error if we are training our own model?

Hello, may I ask whether you succeeded in training a model on an English dataset with DiffSinger? Thank you 🙏
