Resuming training is unintuitive #147

Open
free-mana opened this issue Sep 21, 2022 · 0 comments

Labels
bug Something isn't working

Describe the bug
The process to resume training a previously trained model is unintuitive.

To Reproduce
Steps to reproduce the behavior:

  1. Package Versions: torchgan 0.1.0, torchvision 0.13.1, pytorch 1.12.1
  2. Logging Configurations:
    print(torchgan.logging.backends.CONSOLE_LOGGING)
    1
    print(torchgan.logging.backends.VISDOM_LOGGING)
    0
    print(torchgan.logging.backends.TENSORBOARD_LOGGING)
    0
  3. Minimal Working Example for the error
    The issue can be encountered by slightly modifying Tutorial 1. Follow the tutorial normally until you reach the "Visualizing the Samples" section. Immediately before that section, add the following code cell:
    trainer.load_model("./model/gan4.model")
    trainer(dataloader)
    Now execute the new cell.

Expected behavior
This should have continued training the model for an additional 10 epochs on CUDA, or 5 epochs otherwise. However, because the Trainer's epochs parameter represents the total number of epochs to train, the call returns without doing any further training. To achieve the expected behavior, the user must instead create a new Trainer object and pass (current_epochs + desired_additional_epochs) as the value of the epochs parameter. This is unintuitive, and it forces the user to manually keep track of how many epochs have been completed whenever they end a training session they plan to continue later; a sketch of the workaround follows.
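
For reference, a minimal sketch of the current workaround, written as if continuing from the Tutorial 1 notebook (where Trainer, dataloader, and device are already defined, and the tutorial trained for 10 epochs on CUDA). The network_config and losses names below are placeholders for whatever was passed to the original Trainer, not exact tutorial code:

    # Workaround sketch: build a fresh Trainer whose epochs value is the
    # running total (epochs already completed + additional epochs desired).
    # network_config and losses are placeholders for the original arguments.
    resume_trainer = Trainer(
        network_config, losses,
        sample_size=64,
        epochs=10 + 10,          # total epochs, not additional epochs
        device=device,
    )
    resume_trainer.load_model("./model/gan4.model")
    resume_trainer(dataloader)   # now actually trains for 10 more epochs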

Desktop (please complete the following information):

  • OS: Windows 10 Pro, version 21H2

Installation

  • Pip

Additional context
Fixing this would involve rewriting significant portions of the BaseTrainer class. I would suggest letting the user pass the number of epochs to train for to the __call__() function, rather than having it fixed at object creation, along the lines of the sketch below.
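
A rough illustration of the suggested behavior, written as a subclass rather than a patch. It assumes BaseTrainer records the epoch reached by a loaded checkpoint in start_epoch and trains from start_epoch up to epochs via train(); those attribute names are assumptions about the internals, not guaranteed to match the actual implementation:

    # Illustrative only -- not current torchgan behavior.
    from torchgan.trainer import Trainer

    class ResumableTrainer(Trainer):
        def __call__(self, data_loader, epochs=None, **kwargs):
            # Interpret `epochs` as *additional* epochs on top of whatever the
            # loaded checkpoint has already completed; fall back to the value
            # given at construction when omitted.
            if epochs is not None:
                self.epochs = self.start_epoch + epochs
            self.train(data_loader, **kwargs)

With something like this, trainer.load_model(...) followed by trainer(dataloader, epochs=10) would always train for ten more epochs, regardless of how many had already been completed.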

free-mana added the bug label on Sep 21, 2022