ValueError when resuming from previous training checkpoint #111

Open
jreus opened this issue Aug 16, 2022 · 1 comment
Labels
previous version (concerns a previous version of RAVE; not high-priority)

Comments

@jreus

jreus commented Aug 16, 2022

Hey RAVE team, I repeatedly get a similar error whenever I try to resume a training job that was cancelled after 24 hours on a remote training server. The error is raised in the "validation_epoch_end" hook: ValueError: not enough values to unpack (expected 2, got 0) (see the full stack trace below).

Epoch 24:   0%|          | 0/19333 [00:00<00:00, -25206153.85it/s] 
Traceback (most recent call last):
  File "/jmain02/home/RAVE/train_rave.py", line 175, in <module>
    trainer.fit(model, train, val, ckpt_path=run)
  File "/jmain02/home/.conda/envs/rave/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 770, in fit
    self._call_and_handle_interrupt(
  File "/jmain02/home/.conda/envs/rave/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 723, in _call_and_handle_interrupt
    return trainer_fn(*args, **kwargs)
  File "/jmain02/home/.conda/envs/rave/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 811, in _fit_impl
    results = self._run(model, ckpt_path=self.ckpt_path)
  File "/jmain02/home/.conda/envs/rave/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 1236, in _run
    results = self._run_stage()
  File "/jmain02/home/.conda/envs/rave/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 1323, in _run_stage
    return self._run_train()
  File "/jmain02/home/.conda/envs/rave/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 1353, in _run_train
    self.fit_loop.run()
  File "/jmain02/home/.conda/envs/rave/lib/python3.9/site-packages/pytorch_lightning/loops/base.py", line 204, in run
    self.advance(*args, **kwargs)
  File "/jmain02/home/.conda/envs/rave/lib/python3.9/site-packages/pytorch_lightning/loops/fit_loop.py", line 269, in advance
    self._outputs = self.epoch_loop.run(self._data_fetcher)
  File "/jmain02/home/.conda/envs/rave/lib/python3.9/site-packages/pytorch_lightning/loops/base.py", line 205, in run
    self.on_advance_end()
  File "/jmain02/home/.conda/envs/rave/lib/python3.9/site-packages/pytorch_lightning/loops/epoch/training_epoch_loop.py", line 255, in on_advance_end
    self._run_validation()
  File "/jmain02/home/.conda/envs/rave/lib/python3.9/site-packages/pytorch_lightning/loops/epoch/training_epoch_loop.py", line 309, in _run_validation
    self.val_loop.run()
  File "/jmain02/home/.conda/envs/rave/lib/python3.9/site-packages/pytorch_lightning/loops/base.py", line 211, in run
    output = self.on_run_end()
  File "/jmain02/home/.conda/envs/rave/lib/python3.9/site-packages/pytorch_lightning/loops/dataloader/evaluation_loop.py", line 188, in on_run_end
    self._evaluation_epoch_end(self._outputs)
  File "/jmain02/home/.conda/envs/rave/lib/python3.9/site-packages/pytorch_lightning/loops/dataloader/evaluation_loop.py", line 315, in _evaluation_epoch_end
    self.trainer._call_lightning_module_hook("validation_epoch_end", output_or_outputs)
  File "/jmain02/home/.conda/envs/rave/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 1595, in _call_lightning_module_hook
    output = fn(*args, **kwargs)
  File "/jmain02/home/RAVE/rave/model.py", line 708, in validation_epoch_end
    audio, z = list(zip(*out))
ValueError: not enough values to unpack (expected 2, got 0)
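
The last frame of the traceback can be reproduced in isolation. Below is a minimal sketch in plain Python, assuming `out` is the list of per-batch (audio, z) pairs collected during validation and that it is empty when the error occurs (which is what "got 0" suggests). Unpacking `zip(*out)` over an empty list leaves nothing to assign to `audio` and `z`. The helper name `split_validation_outputs` is hypothetical and is not part of RAVE or PyTorch Lightning; it only illustrates one possible guard.

```python
# Reproduce the failing statement from rave/model.py with an empty list.
# Assumption: `out` is empty when validation_epoch_end is called after resuming.
try:
    audio, z = list(zip(*[]))
except ValueError as err:
    message = str(err)  # same message as in the traceback above


def split_validation_outputs(out):
    """Hypothetical helper (not RAVE code): split (audio, z) pairs.

    Returns None when there are no validation outputs, so the caller
    can skip its epoch-end logic instead of raising ValueError.
    """
    if not out:
        return None
    audio, z = zip(*out)
    return list(audio), list(z)


if __name__ == "__main__":
    print(message)
    print(split_validation_outputs([]))
```

This only avoids the crash. Why the validation outputs would be empty after resuming is not established in this thread.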
@domkirke
Collaborator

Which version of RAVE were you using? Have you tried with 2.3?

@domkirke added the "previous version" label (concerns a previous version of RAVE; not high-priority) on Dec 18, 2023