Errors when using voice converse example #298

Open
xinkez opened this issue Sep 21, 2020 · 9 comments
Labels
bug Something isn't working

Comments

@xinkez

xinkez commented Sep 21, 2020

Hi,

When I ran the script examples/vc/vcc2018/run.sh, I got the errors below. Thank you in advance.

[
Traceback (most recent call last):
File "athena/stargan_main.py", line 179, in <module>
train(json_file, GanSolver, 1, 0)
File "athena/stargan_main.py", line 125, in train
p, model, checkpointer = build_model_from_jsonfile_stargan(jsonfile)
File "athena/stargan_main.py", line 105, in build_model_from_jsonfile_stargan
model_name="gan"
File "/backup/Algorithm/xkzhang/codes/athena/athena/utils/checkpoint.py", line 45, in __init__
super().__init__(**kwargs, model=model)
File "/8T_raid/xkzhang/venv_athena/lib/python3.5/site-packages/tensorflow_core/python/training/tracking/util.py", line 1779, in __init__
% (v,))
ValueError: Checkpoint was expecting a trackable object (an object derived from TrackableBase), got gan. If you believe this object should be trackable (i.e. it is part of the TensorFlow Python API and manages state), please open an issue.
]

@xiaochunxin

The bug is caused by the model_name="gan" argument passed at line 105 of athena/stargan_main.py, in build_model_from_jsonfile_stargan. Just delete model_name="gan" and you can keep running examples/vc/vcc2018/run.sh.

I'm sorry for the trouble our oversight has caused you.
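
For readers wondering why one stray keyword argument produces this error: the traceback shows the checkpoint wrapper forwarding **kwargs to the TensorFlow checkpoint constructor, which rejects any keyword value that is not a trackable object. The sketch below reproduces only that mechanism with stand-in classes. `Trackable`, `BaseCheckpoint`, `Checkpointer` and `try_build` are hypothetical names for illustration, not athena's or TensorFlow's actual code.

```python
class Trackable:
    """Stand-in for TensorFlow's trackable base class (hypothetical)."""


class BaseCheckpoint:
    """Stand-in for the keyword check done by the checkpoint constructor."""

    def __init__(self, **kwargs):
        for name, value in kwargs.items():
            if not isinstance(value, Trackable):
                raise ValueError(
                    "Checkpoint was expecting a trackable object, got %s." % (value,)
                )
            setattr(self, name, value)


class Checkpointer(BaseCheckpoint):
    """Stand-in for a wrapper that forwards extra keywords unchanged."""

    def __init__(self, model=None, **kwargs):
        super().__init__(**kwargs, model=model)


def try_build(**kwargs):
    """Return None on success, or the error message if construction fails."""
    try:
        Checkpointer(model=Trackable(), **kwargs)
    except ValueError as err:
        return str(err)
    return None


# A plain string such as model_name="gan" is not trackable, so it is rejected.
failure = try_build(model_name="gan")
# Without the stray keyword, construction succeeds.
success = try_build()
```

Under these assumptions, `failure` holds the error text and `success` is `None`, which matches the advice to drop the extra keyword.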

@xinkez
Author

xinkez commented Sep 21, 2020

Thank you, it works now. Two more questions: as far as I know, there are 12 speakers in the VCC 2018 training dataset, so why is the parameter <speaker_num> set to 9 in <stargan_voice_conversion.json>? And how do you split the VCC 2018 dataset into , and ?

@xiaochunxin

I fixed the bug. Just take a look at my updated code.

@xinkez
Author

xinkez commented Sep 22, 2020

Thank you. I ran your updated code. Training now works, but when it reaches the dev stage it outputs the errors below. Should the number of speakers in the train set be the same as in the dev set? And how do you split the VCC 2018 dataset into train, dev and test?

INFO:absl:>>>>> start evaluate in epoch 0
INFO:absl:hparams: [('cls', <class 'athena.data.datasets.voice_conversion.VoiceConversionDatasetBuilder'>), ('cmvn_file', 'examples/vc/vcc2018/data/cmvn'), ('codedsp_dim', 36), ('data_csv', 'examples/vc/vcc2018/data_numpy/dev.csv'), ('enable_load_from_disk', True), ('fft_size', 1024), ('fs', 16000), ('input_length_range', [10, 8000]), ('num_cmvn_workers', 1)]
INFO:absl:Successfully load cmvn file examples/vc/vcc2018/data/cmvn
INFO:absl:Loading data from examples/vc/vcc2018/data_numpy/dev.csv
INFO:absl:please be patient, enable tf.function, it takes time ...
Traceback (most recent call last):
File "athena/stargan_main.py", line 179, in <module>
train(json_file, GanSolver, 1, 0)
File "athena/stargan_main.py", line 158, in train
loss_g, metrics_g = solver.evaluate(devset, epoch)
File "athena/athena/solver.py", line 436, in evaluate
total_loss, metrics = evaluate_step(samples)
File "venv_athena/lib/python3.5/site-packages/tensorflow_core/python/eager/def_function.py", line 457, in __call__
result = self._call(*args, **kwds)
File "venv_athena/lib/python3.5/site-packages/tensorflow_core/python/eager/def_function.py", line 524, in _call
*args, **kwds)
File "venv_athena/lib/python3.5/site-packages/tensorflow_core/python/eager/function.py", line 1650, in canonicalize_function_inputs
self._flat_input_signature)
File "venv_athena/lib/python3.5/site-packages/tensorflow_core/python/eager/function.py", line 1716, in _convert_inputs_to_signature
format_error_message(inputs, input_signature))
ValueError: Python inputs incompatible with input_signature:

@xiaochunxin

The number of speakers in the train set must be the same as in the dev set.
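
One way to check this before training is to compare the speaker labels listed in the two CSV files. The sketch below is only an illustration: the thread does not show the layout of train.csv or dev.csv, so the `speaker` column name, the tab delimiter, and the helper names `speaker_set` and `compare_speakers` are all assumptions to adapt to the real files.

```python
import csv
import io


def speaker_set(csv_text, column="speaker", delimiter="\t"):
    """Collect the distinct speaker labels found in one CSV listing.

    The column name and delimiter are assumptions, not athena's documented format.
    """
    reader = csv.DictReader(io.StringIO(csv_text), delimiter=delimiter)
    return {row[column] for row in reader}


def compare_speakers(train_text, dev_text, **kwargs):
    """Return (only_in_train, only_in_dev) as sorted lists."""
    train = speaker_set(train_text, **kwargs)
    dev = speaker_set(dev_text, **kwargs)
    return sorted(train - dev), sorted(dev - train)


# Made-up rows for illustration; real files would be read with open(path).read().
train_csv = "wav\tspeaker\na.wav\tVCC2SF1\nb.wav\tVCC2TF1\n"
dev_csv = "wav\tspeaker\nc.wav\tVCC2SF1\n"

only_train, only_dev = compare_speakers(train_csv, dev_csv)
```

If either returned list is non-empty, the two sets do not cover the same speakers.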

@xinkez
Author

xinkez commented Sep 23, 2020

@xiaochunxin As training goes on, "loss", "metrics_d" and "metrics_g" become nan, and they never return to normal values.

@xiaochunxin

Are you using the VCC2018 corpus? If so, you can check out the patch in this PR ( https://github.com/athena-team/athena/pull/302/files ). The optimizer parameters used previously lead to a high learning rate during training, which causes loss=NaN in some cases because GAN training is highly unstable.
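
As a toy illustration of how a learning rate that is too high ends in NaN, the sketch below runs plain gradient descent on f(x) = x**2. It is not the StarGAN model or athena's optimizer; the function name `descend` and the specific rates are assumptions chosen only to show the effect.

```python
import math


def descend(learning_rate, steps=2000, start=1.0):
    """Run plain gradient descent on f(x) = x**2 and return the final x."""
    x = start
    for _ in range(steps):
        grad = 2.0 * x  # derivative of x**2
        x = x - learning_rate * grad
    return x


# With rate 0.1 each step scales x by 0.8, so x shrinks toward 0.
stable = descend(0.1)
# With rate 1.5 each step scales x by -2, so |x| overflows to inf,
# and the next update computes inf - inf, which is nan.
diverged = descend(1.5)
diverged_is_nan = math.isnan(diverged)
```

Once a value becomes nan, every later update stays nan, which is consistent with the metrics never recovering in the report above.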

@xinkez
Author

xinkez commented Sep 24, 2020

Yes, I'm using the VCC2018 corpus.

First I used all 12 speakers to train the model. The loss was normal, but the wavs generated after training were not very good. So I tried training with only 2 speakers, and the loss became nan as I mentioned before. Yesterday I checked out your patch in the PR ( https://github.com/athena-team/athena/pull/302/files ), but the loss still became nan during training.

The two speakers I chose are VCC2SF1 and VCC2TF1.

@stale

stale bot commented Sep 30, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Sep 30, 2020
@Some-random Some-random added the bug Something isn't working label Oct 1, 2020
@stale stale bot removed the stale label Oct 1, 2020