Errors when using voice converse example #298

xinkez · 2020-09-21T10:53:10Z

Hi,

When I ran the script examples/vc/vcc2018/run.sh, I got the errors below. Thank you in advance.

[
Traceback (most recent call last):
File "athena/stargan_main.py", line 179, in <module>
train(json_file, GanSolver, 1, 0)
File "athena/stargan_main.py", line 125, in train
p, model, checkpointer = build_model_from_jsonfile_stargan(jsonfile)
File "athena/stargan_main.py", line 105, in build_model_from_jsonfile_stargan
model_name="gan"
File "/backup/Algorithm/xkzhang/codes/athena/athena/utils/checkpoint.py", line 45, in __init__
super().__init__(**kwargs, model=model)
File "/8T_raid/xkzhang/venv_athena/lib/python3.5/site-packages/tensorflow_core/python/training/tracking/util.py", line 1779, in __init__
% (v,))
ValueError: Checkpoint was expecting a trackable object (an object derived from TrackableBase), got gan. If you believe this object should be trackable (i.e. it is part of the TensorFlow Python API and manages state), please open an issue.
]


xiaochunxin · 2020-09-21T14:08:17Z

The bug was caused by the argument model_name="gan" in athena/stargan_main.py (line 105, in build_model_from_jsonfile_stargan).
Just delete model_name="gan" and you can keep running examples/vc/vcc2018/run.sh.

I'm sorry for the trouble our oversight caused you.
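
For readers wondering why one extra keyword breaks things: the traceback shows the checkpoint wrapper forwarding all keyword arguments to its TensorFlow base class, which rejects anything that is not a trackable object. Below is a minimal pure-Python sketch of that pattern. The class names Trackable, StrictCheckpoint, Checkpointer and Model are stand-ins made up for illustration; they are not the real TensorFlow or Athena classes, and the check is only an approximation of what the library does.

```python
# Stand-ins for illustration only; these are NOT TensorFlow or Athena classes.
class Trackable:
    """Plays the role of TensorFlow's trackable base class."""


class StrictCheckpoint:
    """Mimics a base class that accepts only trackable keyword arguments."""

    def __init__(self, **kwargs):
        for name, value in sorted(kwargs.items()):
            if not isinstance(value, Trackable):
                raise ValueError(
                    "Checkpoint was expecting a trackable object, got %s." % (value,)
                )
            setattr(self, name, value)


class Checkpointer(StrictCheckpoint):
    """Mimics a wrapper that forwards every extra keyword to its base class."""

    def __init__(self, model=None, **kwargs):
        # Same call shape as the line shown in the traceback.
        super().__init__(**kwargs, model=model)


class Model(Trackable):
    """A hypothetical model that counts as trackable."""


def try_build(**kwargs):
    """Return "ok" if the checkpointer can be built, else the error text."""
    try:
        Checkpointer(**kwargs)
        return "ok"
    except ValueError as err:
        return str(err)


print(try_build(model=Model()))
print(try_build(model=Model(), model_name="gan"))
```

The second call fails because the plain string "gan" reaches the base class as if it were an object to be saved, which matches the "got gan" part of the error message above.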

xinkez · 2020-09-21T14:50:36Z

Thank you. Now it works. I have two more questions. As far as I know, there are 12 speakers in the VCC 2018 training dataset, so why do you set the parameter <speaker_num> to 9 in the file <stargan_voice_conversion.json>? And how do you split the VCC 2018 dataset into train, dev and test?

xiaochunxin · 2020-09-22T03:27:14Z

I fixed the bug. Just take a look at my updated code.

xinkez · 2020-09-22T05:16:22Z

Thank you. I ran your updated code. Training now works, but when it reaches the dev stage, it outputs the errors below. Should the number of speakers in the train set be the same as in the dev set? And how do you split the VCC 2018 dataset into train, dev and test?

INFO:absl:>>>>> start evaluate in epoch 0
INFO:absl:hparams: [('cls', <class 'athena.data.datasets.voice_conversion.VoiceConversionDatasetBuilder'>), ('cmvn_file', 'examples/vc/vcc2018/data/cmvn'), ('codedsp_dim', 36), ('data_csv', 'examples/vc/vcc2018/data_numpy/dev.csv'), ('enable_load_from_disk', True), ('fft_size', 1024), ('fs', 16000), ('input_length_range', [10, 8000]), ('num_cmvn_workers', 1)]
INFO:absl:Successfully load cmvn file examples/vc/vcc2018/data/cmvn
INFO:absl:Loading data from examples/vc/vcc2018/data_numpy/dev.csv
INFO:absl:please be patient, enable tf.function, it takes time ...
Traceback (most recent call last):
File "athena/stargan_main.py", line 179, in <module>
train(json_file, GanSolver, 1, 0)
File "athena/stargan_main.py", line 158, in train
loss_g, metrics_g = solver.evaluate(devset, epoch)
File "athena/athena/solver.py", line 436, in evaluate
total_loss, metrics = evaluate_step(samples)
File "venv_athena/lib/python3.5/site-packages/tensorflow_core/python/eager/def_function.py", line 457, in __call__
result = self._call(*args, **kwds)
File "venv_athena/lib/python3.5/site-packages/tensorflow_core/python/eager/def_function.py", line 524, in _call
*args, **kwds)
File "venv_athena/lib/python3.5/site-packages/tensorflow_core/python/eager/function.py", line 1650, in canonicalize_function_inputs
self._flat_input_signature)
File "venv_athena/lib/python3.5/site-packages/tensorflow_core/python/eager/function.py", line 1716, in _convert_inputs_to_signature
format_error_message(inputs, input_signature))
ValueError: Python inputs incompatible with input_signature:

xiaochunxin · 2020-09-22T05:41:26Z

The number of speakers in the train set must be the same as in the dev set.
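
One way to keep the speaker sets identical is to split per speaker, so that every speaker contributes utterances to each subset. The sketch below is only an assumption about how such a split could be done; it is not how the Athena example prepares its data. The function names, the (speaker_id, utterance_path) input format, the file paths and the 10% ratios are all hypothetical.

```python
import random
from collections import defaultdict


def split_per_speaker(utterances, dev_ratio=0.1, test_ratio=0.1, seed=0):
    """Split (speaker_id, utterance_path) pairs so each speaker is in every subset.

    Each speaker needs at least 3 utterances for all three subsets to be non-empty.
    """
    by_speaker = defaultdict(list)
    for speaker, path in utterances:
        by_speaker[speaker].append(path)

    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    splits = {"train": [], "dev": [], "test": []}
    for speaker in sorted(by_speaker):
        paths = sorted(by_speaker[speaker])
        rng.shuffle(paths)
        n_dev = max(1, int(len(paths) * dev_ratio))
        n_test = max(1, int(len(paths) * test_ratio))
        splits["dev"] += [(speaker, p) for p in paths[:n_dev]]
        splits["test"] += [(speaker, p) for p in paths[n_dev:n_dev + n_test]]
        splits["train"] += [(speaker, p) for p in paths[n_dev + n_test:]]
    return splits


def speakers(split):
    """Return the set of speaker ids present in one subset."""
    return {speaker for speaker, _ in split}


# Hypothetical file list: 10 utterances for each of two speakers.
utterances = [
    (spk, "%s/%05d.wav" % (spk, i))
    for spk in ("VCC2SF1", "VCC2TF1")
    for i in range(10)
]
splits = split_per_speaker(utterances)
# Sanity check before training: every subset must contain the same speakers.
assert speakers(splits["train"]) == speakers(splits["dev"]) == speakers(splits["test"])
```

Running the final check on your own train and dev lists before starting a run is a cheap way to catch a speaker mismatch early.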

xinkez · 2020-09-23T10:49:58Z

@xiaochunxin As training goes on, "loss", "metrics_d" and "metrics_g" become nan, and these values never return to normal.

xiaochunxin · 2020-09-24T09:49:05Z

Are you using the VCC2018 corpus? If so, you can check out the patch in this PR (https://github.com/athena-team/athena/pull/302/files). The optimizer parameters used before lead to a high learning rate during training, which causes loss=NaN in some cases because GAN training is highly unstable.
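
To illustrate the general effect (not the Athena optimizer or the patch itself), here is a toy sketch: plain gradient descent on f(x) = x^2. With a small learning rate the value shrinks toward zero. With a learning rate that is too high the value grows every step until it overflows, and once it turns into nan it stays nan. The function name and the learning rates are assumptions chosen for the demonstration.

```python
import math


def final_x(lr, steps, x=1.0):
    """Run gradient descent on f(x) = x**2 and return the final x."""
    for _ in range(steps):
        grad = 2.0 * x      # derivative of x**2
        x = x - lr * grad   # plain gradient descent update
    return x


stable = final_x(lr=0.1, steps=2000)    # each step multiplies x by 0.8
diverged = final_x(lr=1.5, steps=2000)  # each step multiplies x by -2

print("lr=0.1 ->", stable)
print("lr=1.5 ->", diverged, "(is nan: %s)" % math.isnan(diverged))
```

This also matches the symptom that the values never recover: any arithmetic on nan gives nan again, so the only way out is to restart from a checkpoint saved before the blow-up with a lower learning rate.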

xinkez · 2020-09-24T23:59:38Z

Yes, I'm using the corpus of VCC2018.

First I used all 12 speakers to train the model. The loss was normal, but the wavs generated after training were not very good. So I tried training the model with only 2 speakers, and the loss became nan, as I mentioned before. Yesterday I checked out your patch in the PR (https://github.com/athena-team/athena/pull/302/files), but the loss still became nan during training.

The two speakers I chose are VCC2SF1 and VCC2TF1.

stale · 2020-09-30T02:04:29Z

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale bot added the stale label Sep 30, 2020

Some-random added the bug label Oct 1, 2020

stale bot removed the stale label Oct 1, 2020
