[Feature Request] The proper method for storing optimization result persistently. #590

Closed
jsbyysheng opened this issue May 2, 2024 · 6 comments

@jsbyysheng

Storing optimization results persistently (with or without history) is important for experiments and analysis. However, I cannot find a suitable interface in pymoo for this at the moment. Here is an idea based on zarr for storing the optimization result locally. The code is only a draft; you can use it via pymoo_result_saver('path/to/save/res.zarr', res).save(history=True). I think it would be better and more convenient if pymoo supported persistent storage natively.

import platform
import socket

import numpy as np  # required below; the original draft used np without importing it
import zarr

class pymoo_result_saver:
    def __init__(self, fpath, pymoo_result, exclude=['problem'], other_attrs=None):
        self._pymoo_pop_attrs = ['CV', 'F', 'G', 'H', 'X', 'cv', 'f', 'feas', 'feasible']
        self._pymoo_result_attrs_default = ['CV', 'F', 'G', 'H', 'X', 'archive', 'cv', 'end_time', 'exec_time', 'f', 'feas', 'message', 'pf', 'start_time', 'success']
        self.pymoo_result = pymoo_result
        
        self.zarr_pymoo_result = zarr.open(fpath, mode='w')
        self.zarr_pymoo_result.attrs['system'] = platform.uname().system
        self.zarr_pymoo_result.attrs['release'] = platform.uname().release
        self.zarr_pymoo_result.attrs['os'] = platform.uname().node
        self.zarr_pymoo_result.attrs['os_version'] = platform.uname().version
        self.zarr_pymoo_result.attrs['arch'] = platform.uname().machine
        self.zarr_pymoo_result.attrs['machine'] = socket.gethostname()
        
    def _generate_dataset_name(self, parent_zarr_dataset, child_zarr_dataset):
        if parent_zarr_dataset == '/':
            dataset_name = f'{child_zarr_dataset}'
        else:
            if parent_zarr_dataset.endswith('/'):
                parent_zarr_dataset = parent_zarr_dataset[:-1]
            if child_zarr_dataset.startswith('/'):
                child_zarr_dataset = child_zarr_dataset[1:]
            dataset_name = f'{parent_zarr_dataset}/{child_zarr_dataset}'
        return dataset_name
        
    def _pymoo_zarr_append(self, dataset_name, value):
        # Store one value as a zarr array; anything that cannot be converted is stored as NaN.
        if value is None:
            # np.nan instead of np.NAN: the upper-case alias was removed in NumPy 2.0
            self.zarr_pymoo_result.array(dataset_name, np.array(np.nan))
        elif isinstance(value, np.ndarray):
            self.zarr_pymoo_result.array(dataset_name, value)
        elif isinstance(value, (list, tuple)):
            self.zarr_pymoo_result.array(dataset_name, np.array(value))
        else:
            try:
                # Wrap scalars (numbers, strings, booleans) in a one-element array
                self.zarr_pymoo_result.array(dataset_name, np.array([value]))
            except Exception:
                self.zarr_pymoo_result.array(dataset_name, np.array(np.nan))
        
    def _pymoo_zarr_dump_default(self):
        for key_name in self._pymoo_result_attrs_default:
            self._pymoo_zarr_append(key_name, getattr(self.pymoo_result, key_name))
            
    def _pymoo_zarr_dump_pop(self, pymoo_pop, parent_zarr_dataset='/', child_zarr_dataset='pop'):
        self._pymoo_zarr_dump_opt(pymoo_pop, parent_zarr_dataset=parent_zarr_dataset, child_zarr_dataset=child_zarr_dataset)
    
    def _pymoo_zarr_dump_opt(self, pymoo_opt, parent_zarr_dataset='/', child_zarr_dataset='opt'):
        if pymoo_opt is not None:
            dataset_name = self._generate_dataset_name(parent_zarr_dataset, child_zarr_dataset)
            for attr in self._pymoo_pop_attrs:
                self._pymoo_zarr_append(f'{dataset_name}/{attr}', pymoo_opt.get(attr))
    
    def _pymoo_zarr_dump_algorithm(self, pymoo_algo, parent_zarr_dataset='/', child_zarr_dataset='algorithm'):
        if pymoo_algo is not None:
            dataset_name = self._generate_dataset_name(parent_zarr_dataset, child_zarr_dataset)
            self._pymoo_zarr_append(f'{dataset_name}/n_gen', getattr(pymoo_algo, 'n_gen'))
            self._pymoo_zarr_append(f'{dataset_name}/n_offsprings', getattr(pymoo_algo, 'n_offsprings'))
            self._pymoo_zarr_append(f'{dataset_name}/n_eval', getattr(pymoo_algo.evaluator, 'n_eval'))
            self._pymoo_zarr_dump_opt(pymoo_algo.opt, parent_zarr_dataset=dataset_name)
    
    def _pymoo_zarr_dump_history(self, pymoo_hist, parent_zarr_dataset='/', child_zarr_dataset='history'):
        dataset_name = self._generate_dataset_name(parent_zarr_dataset, child_zarr_dataset)
        if pymoo_hist == []:
            self._pymoo_zarr_append(dataset_name, pymoo_hist)
        if pymoo_hist is not None:
            for idx, algo in enumerate(pymoo_hist):
                self._pymoo_zarr_dump_algorithm(algo, parent_zarr_dataset=f'{dataset_name}', child_zarr_dataset=f'{idx}')
        
    def save(self, whole=True, history=False):
        self._pymoo_zarr_dump_default()
        if whole:
            self._pymoo_zarr_dump_algorithm(self.pymoo_result.algorithm)
            self._pymoo_zarr_dump_opt(self.pymoo_result.opt)
            self._pymoo_zarr_dump_pop(self.pymoo_result.pop)
        if history:
            self._pymoo_zarr_dump_history(self.pymoo_result.history)
@blankjul self-assigned this May 12, 2024
@blankjul
Collaborator

The documentation already explains two ways to do this:

  1. Explained in the Getting started guide: https://pymoo.org/getting_started/index.html
  2. Using a callback: https://pymoo.org/interface/callback.html

Please let me know if this works for you.

@jsbyysheng
Author

Of course, these work, and I have read the pymoo documentation carefully. However, I think we need a more convenient and user-friendly way to save optimization results and the history of a run. For example, a matplotlib figure can be saved simply with fig.savefig('path/to/save').

@blankjul
Collaborator

I think your solution might be a little overkill. You can simply use pickle or dill to serialize an entire object (the most common way in Python).
You can either serialize the algorithm object or algorithm.history if you have enabled the save_history option.
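
A minimal sketch of the pickle route described above. To keep the snippet runnable without pymoo, `result` is a hypothetical stand-in built from `types.SimpleNamespace`; in a real run one would presumably pass the result object (or its history) to `save_object` instead. The helper names `save_object` and `load_object` are illustrative, not part of pymoo.

```python
import os
import pickle
import tempfile
from types import SimpleNamespace


def save_object(obj, path):
    """Serialize any picklable object to a file."""
    with open(path, "wb") as fh:
        pickle.dump(obj, fh, protocol=pickle.HIGHEST_PROTOCOL)


def load_object(path):
    """Load an object previously written by save_object."""
    with open(path, "rb") as fh:
        return pickle.load(fh)


# Hypothetical stand-in for an optimization result (not a pymoo object).
result = SimpleNamespace(X=[[0.1, 0.2]], F=[[1.5, 2.5]])

path = os.path.join(tempfile.mkdtemp(), "res.pkl")
save_object(result, path)
restored = load_object(path)
```

Note that unpickling needs every class referenced in the file to be importable at load time, which is the limitation discussed in the next comments.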

@jsbyysheng
Author

I've already tried using pickle and dill, but they didn't work as expected. When I use either to serialize the algorithm object or algorithm.history, it incorporates instances of the MyProblem class into the pickle file, which causes errors upon loading. Perhaps using zarr might seem excessive, but it can restore the data in the algorithm object or algorithm.history as much as possible.
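
One lighter-weight workaround, sketched here under assumptions, is to copy only the plain data attributes out of the result before pickling so that the problem class never enters the file. The attribute names are taken from the draft above; `result` is a hypothetical stand-in whose `problem` field is deliberately unpicklable, and `extract_plain_data` is an illustrative helper, not a pymoo function.

```python
import pickle
from types import SimpleNamespace

# Data attributes to keep; names follow the draft saver above.
DATA_ATTRS = ("X", "F", "G", "H", "CV")


def extract_plain_data(result, names=DATA_ATTRS):
    """Return a dict holding only the named attributes (None if missing)."""
    return {name: getattr(result, name, None) for name in names}


# Hypothetical stand-in: the lambda plays the role of an unpicklable problem.
result = SimpleNamespace(
    X=[[0.1, 0.2]],
    F=[[1.5]],
    G=None,
    problem=lambda x: x,
)

payload = pickle.dumps(extract_plain_data(result))
restored = pickle.loads(payload)
```

The resulting file contains only built-in containers (and arrays, in a real run), so it can be loaded without the original problem definition.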

@blankjul
Collaborator

You can modify your problem to become serializable (or use this trick within an algorithm).

https://stackoverflow.com/questions/2345944/exclude-objects-field-from-pickling-in-python
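
The trick from the linked answer can be sketched as follows: define `__getstate__` to drop the field that cannot be pickled and `__setstate__` to restore the rest. The class below is a hypothetical stand-in, not a pymoo `Problem` subclass, and the `threading.Lock` merely plays the role of an unpicklable resource such as a GPU context or a process pool.

```python
import pickle
import threading


class MyProblem:
    """Hypothetical problem holding a resource that cannot be pickled."""

    def __init__(self, n_var):
        self.n_var = n_var
        # Stand-in for a GPU context, process pool, or similar handle.
        self._handle = threading.Lock()

    def __getstate__(self):
        # Copy the instance dict and blank out the unpicklable field.
        state = self.__dict__.copy()
        state["_handle"] = None
        return state

    def __setstate__(self, state):
        # Restore everything else; the handle stays None and would have
        # to be recreated before the problem is evaluated again.
        self.__dict__.update(state)


problem = MyProblem(n_var=3)
clone = pickle.loads(pickle.dumps(problem))
```

This keeps the object loadable, but the class itself must still be importable wherever the file is read.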

@jsbyysheng
Author

jsbyysheng commented May 17, 2024

It's great to hear from you, and I appreciate your advice. However, my problem involves deforming a 3D volume and evaluating the results at each step. This is an implicit process involving multiple processes and GPU acceleration, making it challenging to serialize or simplify using tricks. Anyway, pymoo has proven to be an effective optimization tool for me after trying several others. Thank you so much for your excellent work and patient response.
