Save intermediate populations to file to enable restarting #59

Open
jdossgollin opened this issue Dec 3, 2022 · 2 comments

Comments

@jdossgollin

This is a feature request and I'm not sure whether you're interested in including it in your package

For optimizing heavy-ish functions, sometimes computer stuff happens. It would be great to have an easy API for saving intermediate populations to file so that if computation is interrupted, it's possible to pick up from about where it left off.

I think this should be pretty easy to do and I'm happy to try to help out with a first stab PR (may take a while, next few weeks are busy, and will probably need some additional work) but wanted to check in on what that API and implementation might look like.

API: this seems like it might go in Options(). I'm not sure what the keyword arguments should be named, but it seems like there should be an option to save results every N iterations (defaults to Inf/missing/similar that gives current behavior). Additionally, I think making the user define the filename to use for caching would make sense.

Implementation: Every N iterations, use JLD2 to save the result from optimize. optimize would need to check whether a saved file already exists; to load from that file, an approach similar to that given in https://jmejia8.github.io/Metaheuristics.jl/stable/examples/#Providing-Initial-Solutions should work.
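
To make the idea concrete, here is a rough sketch of the save-every-N / resume-if-file-exists loop. It is not Metaheuristics.jl code: `toy_optimize`, `save_every`, and `checkpoint_file` are hypothetical names, the optimizer is a trivial random search, and the standard-library Serialization module stands in for JLD2 so that the snippet runs without extra packages.

```julia
using Serialization  # stdlib; stands in for JLD2 in this sketch

# Hypothetical toy random search, NOT the package's optimize.
# `save_every` and `checkpoint_file` are assumed keyword names.
function toy_optimize(f, lower, upper; iterations = 100, popsize = 20,
                      save_every = 10, checkpoint_file = "checkpoint.jls")
    dim = length(lower)
    if isfile(checkpoint_file)
        # Resume from the saved population and iteration counter.
        state = deserialize(checkpoint_file)
        population, start = state.population, state.iteration + 1
    else
        # Fresh start: sample uniformly inside the bounds.
        population = [lower .+ rand(dim) .* (upper .- lower) for _ in 1:popsize]
        start = 1
    end
    for iteration in start:iterations
        # Perturb each individual and keep the better of parent and child.
        for i in eachindex(population)
            child = clamp.(population[i] .+ 0.1 .* randn(dim), lower, upper)
            if f(child) <= f(population[i])
                population[i] = child
            end
        end
        # Write a checkpoint every `save_every` iterations.
        if iteration % save_every == 0
            serialize(checkpoint_file, (population = population, iteration = iteration))
        end
    end
    return population[argmin(f.(population))]
end
```

Calling it a second time with the same `checkpoint_file` after an interruption would pick up at the iteration following the last checkpoint instead of starting over.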

@jdossgollin
Author

Any implementation of this should probably coordinate with #39

@jmejia8
Owner

jmejia8 commented Dec 4, 2022

Hi! Sure, any kind of contribution is appreciated.

An easy way for saving the current status of an optimization process could be great; however, restarting the optimization (after an unexpected stopping) may require some considerations:

  1. Some algorithms adapt their parameters automatically, so those adapted parameters would need to be saved too.
  2. Should information on the optimization problem be saved as well?
  3. How should the random number generator be restored when the user has fixed its seed?
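
On point 3, one possible approach (a sketch only, not something the package does today) is to store the generator object itself rather than the seed, so that a resumed run draws the same numbers an uninterrupted run would have drawn. The snippet below uses the standard-library Random and Serialization modules and assumes Julia 1.7 or later for Xoshiro; the serialized format is not guaranteed to be readable across Julia versions.

```julia
using Random, Serialization

rng = Xoshiro(42)        # seed fixed by the user
rand(rng, 3)             # some work happens before the checkpoint

path = tempname()
serialize(path, rng)     # store the generator state, not just the seed

expected = rand(rng, 3)  # what the uninterrupted run would draw next

restored = deserialize(path)
resumed = rand(restored, 3)  # what the restarted run draws
```

Re-seeding with 42 on restart would instead replay the stream from the beginning, which is why the state is saved rather than the seed.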

I would prefer to use BSON because it is lighter and faster than JLD2, as far as I remember.

Regarding your idea, I have two possible implementations for you:

1. Using Options:

method = ECA(options = Options(save_every = 10)) # to save every 10 generations/iterations
optimize(f, bounds, method)

A possible autogenerated file structure:

├── saving
│   ├── ECA
│   │   ├── data-04-dec-2022-at-09-47-56.bson
├── optimizing-heavy-ish-functions.jl

Here the BSON file would store a dictionary containing everything saved in method. This would probably require implementing a recursive struct-to-Dict conversion.
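
A recursive struct-to-Dict conversion could look roughly like the sketch below. `to_dict`, `ToyParams`, and `ToyMethod` are hypothetical names used only for illustration; the real method structs have more fields and may need special cases (functions, RNGs, and so on) that this does not handle.

```julia
# Hypothetical helper: recursively convert a struct into nested Dicts.
# Leaf values are returned unchanged.
to_dict(x::Union{Number, AbstractString, Symbol, Nothing}) = x
# Containers are converted element by element.
to_dict(x::AbstractArray) = map(to_dict, x)
to_dict(x::Tuple) = map(to_dict, x)
# Anything else: if it is a struct, turn each field into a Dict entry.
function to_dict(x::T) where {T}
    isstructtype(T) || return x
    return Dict{Symbol, Any}(name => to_dict(getfield(x, name)) for name in fieldnames(T))
end

# Toy stand-ins for an algorithm and its adapted parameters.
struct ToyParams
    F::Float64
    CR::Float64
end

struct ToyMethod
    name::String
    parameters::ToyParams
    population::Vector{Vector{Float64}}
end

d = to_dict(ToyMethod("toy", ToyParams(0.5, 0.9), [[1.0, 2.0]]))
```

The nested struct ends up as a nested Dict, so `d[:parameters][:CR]` gives back the stored value.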

To restart, the API would check whether a saved file already exists in saving/ECA/, so the code in optimizing-heavy-ish-functions.jl would not need to be modified.


2. Wrapping the optimizers:

The idea here is to define a wrapper BackUp(method) that saves everything in method. For example

solver = BackUp(ECA(), every = 10, at = "my-back-up-file.bson")
optimize(f, bounds, solver)

Here, the API would check whether the file given in at (my-back-up-file.bson in this example) already exists and, if so, try to restart the optimization process from it. If the file does not exist, ECA is initialized as usual.
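
A minimal sketch of how such a wrapper might behave, under heavy assumptions: the `BackUp` struct, `restore_or_init`, and `maybe_save` below are hypothetical and not part of Metaheuristics.jl, `ToyMethod` stands in for a real algorithm such as ECA, and the standard-library Serialization module is used in place of BSON to keep the snippet dependency-free.

```julia
using Serialization  # stdlib; stands in for BSON in this sketch

# Hypothetical wrapper; field names mirror the `every` and `at` keywords.
struct BackUp{M}
    method::M
    every::Int
    at::String
end
BackUp(method; every = 10, at = "backup.jls") = BackUp(method, every, at)

# Return the saved method if a backup file exists, otherwise the fresh one.
restore_or_init(b::BackUp) = isfile(b.at) ? deserialize(b.at) : b.method

# Save the method's current state when the iteration count is a multiple of `every`.
function maybe_save(b::BackUp, method, iteration)
    iteration % b.every == 0 && serialize(b.at, method)
    return nothing
end

# Toy stand-in for an algorithm that carries state between iterations.
mutable struct ToyMethod
    population::Vector{Float64}
end

path = tempname()
solver = BackUp(ToyMethod([1.0, 2.0]); every = 5, at = path)
method = restore_or_init(solver)     # no file yet, so the fresh method is used
method.population .= [0.5, 0.25]     # pretend some iterations ran
maybe_save(solver, method, 5)        # iteration 5 is due, so a backup is written
resumed = restore_or_init(solver)    # a later run picks up the saved state
```

One point in favour of the wrapper design is that the backup logic lives in one place and works for any method, without adding fields to Options.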

Names for the possible wrapper: "BackUp", "Save", "StoreTraces", ...
