[FEATURE]when saving multiple epochs add an epoch number suffix for when save best=False #597

Quetzalcohuatl · 2024-01-31T00:58:05Z

🚀 Feature

Save a separate .pth file at each checkpoint instead of overwriting checkpoint.pth every time.

Motivation

It is often useful to see how the model performs at each epoch or save point. For example, when training an LLM, I want to measure the generative capabilities after each epoch and see whether the model is improving.

Quetzalcohuatl · 2024-01-31T01:00:35Z

Example: after epoch 1 it saves checkpoint_ep01.pth

After epoch 2 it saves checkpoint_ep02.pth

When loading the model back in according to the config, it would by default load sorted(glob("checkpoint_ep*"))[-1], i.e. the last epoch, so the behavior stays the same as it currently is.

Alternatively, if save_best_only=true, keep the current behavior of saving as checkpoint.pth?
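A minimal sketch of the naming and loading scheme described above. The function names `checkpoint_filename` and `latest_checkpoint` are hypothetical and are not taken from the project's code; only the file names and the `save_best_only` flag come from this comment. The fallback to plain checkpoint.pth when no per-epoch file exists is an added assumption.

```python
from pathlib import Path
from typing import Optional


def checkpoint_filename(epoch: int, save_best_only: bool) -> str:
    """Return the checkpoint file name for a given epoch (hypothetical helper)."""
    if save_best_only:
        # Keep the current behavior: a single file that is overwritten.
        return "checkpoint.pth"
    # Zero-pad to two digits so a plain lexicographic sort orders epochs 1..99.
    return f"checkpoint_ep{epoch:02d}.pth"


def latest_checkpoint(directory: Path) -> Optional[Path]:
    """Pick the last per-epoch checkpoint, else fall back to checkpoint.pth."""
    candidates = sorted(directory.glob("checkpoint_ep*.pth"))
    if candidates:
        return candidates[-1]
    # Assumption: fall back to the single-file name if no per-epoch file exists.
    fallback = directory / "checkpoint.pth"
    return fallback if fallback.exists() else None
```

Note that sorting by file name only matches epoch order while the zero-padding is wide enough; with two digits that holds up to epoch 99.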

psinger · 2024-01-31T09:15:49Z

We didn't do that by default, as model weights take a ton of disk space.

We could theoretically make it a separate setting to additionally save all checkpoints, wdyt?
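One way such an opt-in setting could look, as a rough sketch only. `CheckpointConfig`, `save_all_checkpoints`, and `files_to_write` are hypothetical names for illustration and do not reflect the project's actual configuration. The default stays off, so disk usage is unchanged unless the user enables it.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CheckpointConfig:
    # Hypothetical option: off by default because weights take a lot of disk space.
    save_all_checkpoints: bool = False


def files_to_write(cfg: CheckpointConfig, epoch: int) -> List[str]:
    """List the checkpoint files to write at the end of an epoch."""
    # The usual checkpoint.pth is always written (and overwritten).
    names = ["checkpoint.pth"]
    if cfg.save_all_checkpoints:
        # Additionally keep a per-epoch copy that is never overwritten.
        names.append(f"checkpoint_ep{epoch:02d}.pth")
    return names
```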

Quetzalcohuatl · 2024-01-31T10:45:55Z

> We didn't do that by default, as model weights take a ton of disk space.
>
> We could theoretically make it a separate setting to additionally save all checkpoints, wdyt?

Most research papers train for only 1 epoch, sometimes 2. If the user knows what they're doing and wants to enable it, I think it's a nice option, especially since it's a simple implementation.

Quetzalcohuatl added the type/feature Feature request label Jan 31, 2024

pascal-pfeiffer self-assigned this May 23, 2024

pascal-pfeiffer mentioned this issue May 23, 2024

Save each evaluation epoch #721

Merged

pascal-pfeiffer closed this as completed in #721 May 24, 2024
