[BUG or Rather Improvement?] #392

Open
lunjohnzhang opened this issue Oct 5, 2023 · 3 comments
Labels
bug Something isn't working

Comments

@lunjohnzhang
Contributor

  • pyribs version: 0.5.0
  • Python version: 3.8.18
  • Operating System: Ubuntu 20.04

Description

I am trying to use EvolutionStrategyEmitter with CMAEvolutionStrategy to generate high-dimensional solutions (i.e., more than 3000 dimensions) where every dimension lies within the bounds [1e-3, None] (i.e., larger than a small positive number). However, the current implementation of CMAEvolutionStrategy simply keeps resampling until all solutions are within the bounds, so with high-dimensional solutions it effectively loops forever (in theory it will eventually sample solutions within the bounds, but it takes far too long).
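
For a rough sense of why the loop stalls, here is a back-of-the-envelope sketch. It assumes the first `ask` call samples each dimension independently from a Gaussian with mean 1 and standard deviation 1 (matching `x0` and `sigma0` in the script below, with an identity covariance), and it only considers the lower bound. The helper name `in_bounds_probability` is made up for illustration and is not part of pyribs.

```python
import math


def in_bounds_probability(mean, sigma, lower):
    """Return P(x >= lower) for x ~ N(mean, sigma^2)."""
    z = (lower - mean) / sigma
    return 0.5 * (1.0 - math.erf(z / math.sqrt(2.0)))


# Values taken from the reproduction script: x0 = 1, sigma0 = 1, lower = 1e-3.
p_dim = in_bounds_probability(mean=1.0, sigma=1.0, lower=1e-3)

# Assumption: dimensions are independent, so a whole solution is accepted
# with probability p_dim ** solution_dim. Work in log10 to avoid underflow.
solution_dim = 3000
log10_p_all = solution_dim * math.log10(p_dim)

print(f"per-dimension acceptance probability: {p_dim:.4f}")
print(f"log10 of whole-solution acceptance probability: {log10_p_all:.1f}")
```

Under these assumptions each dimension is accepted about 84% of the time, but all 3000 dimensions must be accepted at once, so the chance that a single sample passes is hundreds of orders of magnitude below one.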

Would it be possible to change the implementation so that it resamples only for a fixed number of iterations and then clips the solutions to the bounds?

Steps to Reproduce

The following code reproduces the above issue:

import fire
import numpy as np

from threadpoolctl import threadpool_limits
from ribs.emitters import EvolutionStrategyEmitter
from ribs.emitters.opt import CMAEvolutionStrategy
from ribs.archives import GridArchive
from ribs.schedulers import Scheduler
from ribs._utils import readonly


class DebugCMAEvolutionStrategy(CMAEvolutionStrategy):
    # Limit OpenBLAS to single thread. This is typically faster than
    # multithreading because our data is too small.
    @threadpool_limits.wrap(limits=1, user_api="blas")
    def ask(self, lower_bounds, upper_bounds, batch_size=None):
        """Samples new solutions from the Gaussian distribution.

        Args:
            lower_bounds (float or np.ndarray): scalar or (solution_dim,) array
                indicating lower bounds of the solution space. Scalars specify
                the same bound for the entire space, while arrays specify a
                bound for each dimension. Pass -np.inf in the array or scalar to
                indicate unbounded space.
            upper_bounds (float or np.ndarray): Same as above, but for upper
                bounds (and pass np.inf instead of -np.inf).
            batch_size (int): batch size of the sample. Defaults to
                ``self.batch_size``.
        """
        if batch_size is None:
            batch_size = self.batch_size

        self._solutions = np.empty((batch_size, self.solution_dim),
                                   dtype=self.dtype)
        self.cov.update_eigensystem(self.current_eval, self.lazy_gap_evals)
        transform_mat = self.cov.eigenbasis * np.sqrt(self.cov.eigenvalues)

        # Resampling method for bound constraints -> sample new solutions until
        # all solutions are within bounds.
        remaining_indices = np.arange(batch_size)
        while len(remaining_indices) > 0:
            unscaled_params = self._rng.normal(
                0.0,
                self.sigma,
                (len(remaining_indices), self.solution_dim),
            ).astype(self.dtype)
            new_solutions, out_of_bounds = self._transform_and_check_sol(
                unscaled_params, transform_mat, self.mean, lower_bounds,
                upper_bounds)
            self._solutions[remaining_indices] = new_solutions
            print("new sol: ", new_solutions)

            # Find indices in remaining_indices that are still out of bounds
            # (out_of_bounds indicates whether each value in each solution is
            # out of bounds).
            remaining_indices = remaining_indices[np.any(out_of_bounds,
                                                         axis=1)]
            print("Remaining indices:", remaining_indices)
        return readonly(self._solutions)


def infinite_sampling(sol_size, seed):
    archive = GridArchive(
        solution_dim=sol_size,
        dims=[100, 50],
        ranges=[[10, 30], [0, 10]],
        seed=seed,
        dtype=np.float32,
    )

    bounds = [(1e-3, None) for _ in range(sol_size)]

    initial_solution = np.ones(sol_size)
    emitters = [
        EvolutionStrategyEmitter(
            archive,
            es=DebugCMAEvolutionStrategy,
            x0=initial_solution,
            sigma0=1,
            seed=seed,
            bounds=bounds,
            batch_size=10,
        )
    ]

    scheduler = Scheduler(
        archive,
        emitters,
    )

    sols = scheduler.ask()
    print(sols)

    return emitters, scheduler


if __name__ == "__main__":
    fire.Fire(infinite_sampling)

To reproduce the issue, run with python <script_name> 3000 42.

@lunjohnzhang lunjohnzhang added the bug Something isn't working label Oct 5, 2023
@btjanaka
Member

btjanaka commented Oct 5, 2023

Hi @lunjohnzhang, thank you for the suggestions! I can confirm that I am able to reproduce the behavior. This seems somewhere between a bug report and a feature request.

The new behavior seems reasonable, but I'm not sure what the API should be for it. Perhaps there could be a bounds_iters parameter to CMAEvolutionStrategy.__init__ that is used in ask as part of the behavior you describe? bounds_iters could default to None to indicate the behavior should not be activated. If someone wants the behavior, they can pass bounds_iters in the es_kwargs of EvolutionStrategyEmitter.
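
To make the proposal concrete, here is a minimal standalone sketch of resampling with a `bounds_iters` cap followed by clipping. It is not the pyribs implementation: the function name `ask_with_bounds_iters` is hypothetical, it samples from an isotropic Gaussian rather than using the CMA-ES covariance transform, and it always performs at least one sampling round. `bounds_iters=None` keeps the existing resample-until-valid behavior.

```python
import numpy as np


def ask_with_bounds_iters(rng, mean, sigma, lower_bounds, upper_bounds,
                          batch_size, bounds_iters=None):
    """Hypothetical sketch: resample at most `bounds_iters` rounds, then clip.

    With bounds_iters=None, resampling continues until every solution is in
    bounds (the current behavior described in this issue).
    """
    solution_dim = len(mean)
    solutions = np.empty((batch_size, solution_dim), dtype=mean.dtype)
    remaining = np.arange(batch_size)
    iters = 0

    while len(remaining) > 0:
        # Simplification: isotropic Gaussian, no covariance transform.
        noise = rng.normal(0.0, sigma, (len(remaining), solution_dim))
        new_solutions = (mean + noise).astype(mean.dtype)
        solutions[remaining] = new_solutions

        # Keep only the indices whose solutions violate a bound somewhere.
        out_of_bounds = np.logical_or(new_solutions < lower_bounds,
                                      new_solutions > upper_bounds)
        remaining = remaining[np.any(out_of_bounds, axis=1)]

        iters += 1
        if bounds_iters is not None and iters >= bounds_iters:
            break

    # Anything still out of bounds after the cap is clipped into the bounds.
    if len(remaining) > 0:
        solutions[remaining] = np.clip(solutions[remaining], lower_bounds,
                                       upper_bounds)
    return solutions


rng = np.random.default_rng(42)
sols = ask_with_bounds_iters(rng, np.ones(3000), 1.0, 1e-3, np.inf,
                             batch_size=10, bounds_iters=5)
print(sols.shape, sols.min())
```

One trade-off of this design is that clipping places the out-of-bounds coordinates exactly on the boundary, which changes the sampled distribution.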

Would you be willing to write a PR for this? No worries if not; I can always circle back to it later.

@lunjohnzhang
Contributor Author

Thanks for the reply! The solution with bounds_iters makes sense to me and I will be happy to write a PR for it. I guess you would want this feature in other ES as well?

@btjanaka
Member

btjanaka commented Oct 5, 2023

Yes, that would be great. Thanks!

btjanaka added a commit that referenced this issue Mar 14, 2024
## Description

<!-- Provide a brief description of the PR's purpose here. -->

A common error when using bounds is that CMA-ES or another ES can hang
due to resampling, as solutions that fall outside of the bounds need to
be resampled until they are within bounds. This PR adds a warning so
that users will at least know that this behavior is occurring. We are
still unclear how to deal with bounds properly, as it is also an open
research question. #392 has proposed clipping the solutions after a set
number of iterations of resampling but it is unclear if this is the best
solution.
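
As a rough illustration of the warning approach, the sketch below keeps resampling until everything is in bounds but emits a `RuntimeWarning` once the loop has run for a set number of rounds. The function name `resample_with_warning`, the `warn_after` parameter, and the message text are all hypothetical and are not taken from the pyribs source. It also uses an isotropic Gaussian and a small solution dimension so that the loop finishes quickly.

```python
import warnings

import numpy as np


def resample_with_warning(rng, mean, sigma, lower_bounds, upper_bounds,
                          batch_size, warn_after=100):
    """Hypothetical sketch: resample until in bounds, warning once if slow."""
    solution_dim = len(mean)
    solutions = np.empty((batch_size, solution_dim))
    remaining = np.arange(batch_size)
    iters = 0

    while len(remaining) > 0:
        new_solutions = mean + rng.normal(
            0.0, sigma, (len(remaining), solution_dim))
        solutions[remaining] = new_solutions
        out_of_bounds = np.logical_or(new_solutions < lower_bounds,
                                      new_solutions > upper_bounds)
        remaining = remaining[np.any(out_of_bounds, axis=1)]
        iters += 1

        # Warn exactly once, when the threshold is reached and work remains.
        if iters == warn_after and len(remaining) > 0:
            warnings.warn(
                f"Resampling for bounds has run for {iters} iterations; "
                "the bounds may be too tight for the current distribution.",
                RuntimeWarning,
            )
    return solutions, iters


warn_after = 5
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    sols, n_iters = resample_with_warning(
        np.random.default_rng(0), np.ones(20), 1.0, 1e-3, np.inf,
        batch_size=4, warn_after=warn_after)
print(n_iters, len(caught))
```

The warning does not stop the loop. It only makes a long-running resample visible to the user.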

## TODO

<!-- Notable points that this PR has either accomplished or will
accomplish. -->

- [x] Fix slight issue with how OpenAI-ES handles resampling
- [x] Add tests -> since this behavior is supposed to make the tests
hang, we just put this as a script in one of the tests that can be
manually run
- [x] Modify ESs

## Status

- [x] I have read the guidelines in [CONTRIBUTING.md](https://github.com/icaros-usc/pyribs/blob/master/CONTRIBUTING.md)
- [x] I have formatted my code using `yapf`
- [x] I have tested my code by running `pytest`
- [x] I have linted my code with `pylint`
- [x] I have added a one-line description of my change to the changelog in `HISTORY.md`
- [x] This PR is ready to go