
mp_island starts its own ipyparallel_bfe even when constructed with a UDBFE [BUG] #148

Open
dalbabur opened this issue Jan 28, 2024 · 3 comments
Labels
bug Something isn't working

Comments

@dalbabur

Describe the bug
I'm passing a UDBFE with an initialized ipyparallel view to the island constructor. When initializing the population, it correctly uses the UDBFE. However, when evolving the island, it tries to create a new BFE with the default ipyparallel cluster!

To Reproduce

Define UDI and UDBFE

udi = pg.mp_island()

udbfe = pg.ipyparallel_bfe()
udbfe.init_view(client_kwargs={'profile':'my-profile'})
mybfe = pg.bfe(udbfe)

a = pg.pso_gen(gen=10)
a.set_bfe(mybfe)
algo = pg.algorithm(a)

Start islands. Evaluation does happen with the correct UDBFE

islands = [pg.island(udi=udi, algo=algo, prob=myprob, size=75, b=mybfe, r_pol=pg.fair_replace(0.0), s_pol=pg.select_best(0.0)) for _ in range(8)]

Error when evolving the islands:

RuntimeError: The asynchronous evolution of a pythonic island of type 'Multiprocessing island' raised an error:
multiprocessing.pool.RemoteTraceback: 
Traceback (most recent call last):
  File "/opt/conda/envs/myenv/lib/python3.9/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "/opt/conda/envs/myenv/lib/python3.9/site-packages/pygmo/_py_islands.py", line 26, in _evolve_func_mp_pool
    new_pop = algo.evolve(pop)
  File "/opt/conda/envs/myenv/lib/python3.9/site-packages/pygmo/_py_bfes.py", line 470, in __call__
    ipyparallel_bfe._view = _make_ipyparallel_view(
  File "/opt/conda/envs/myenv/lib/python3.9/site-packages/pygmo/_ipyparallel_utils.py", line 14, in _make_ipyparallel_view
    rc = Client(*client_args, **client_kwargs)
  File "/opt/conda/envs/myenv/lib/python3.9/site-packages/ipyparallel/client/client.py", line 454, in __init__
    raise OSError(msg)
OSError: Connection file '~/.ipython/profile_default/security/ipcontroller-client.json' not found.
You have attempted to connect to an IPython Cluster but no Controller could be found.
Please double-check your configuration and ensure that a cluster is running.
@dalbabur dalbabur added the bug Something isn't working label Jan 28, 2024
@bluescarni
Member

Hi @dalbabur

When you run an evolution on an mp_island, the algorithm state (which includes the mybfe object set via a.set_bfe(mybfe)) needs to be serialised and transmitted to the remote process spawned by mp_island. The algorithm will be deserialised in the remote process and evolution can then start.

My guess would be that during (de)serialisation of the mybfe object, the information about the custom setup of the ipyparallel view is not kept.

Is there any specific reason why you want to mix process-based serialisation with ipyparallel?

@dalbabur
Author

dalbabur commented Feb 2, 2024

Oh I see, that makes sense. Then, theoretically, it would be possible by modifying __setstate__ and __getstate__, no?

The original reason I was interested in mixing process-based serialisation with ipyparallel was to offload part of the work to a local machine, while doing the more expensive fitness evaluations in a remote cluster. I've tried two ipyparallel clusters, and that works fine, but was wondering about other options, as increasing the number of ipyparallel nodes really slows things down.

@bluescarni
Member

@dalbabur

After taking a look at the code, I realised my explanation was partly incorrect. Indeed, what is happening is that pygmo manages a global instance of the ipyparallel_view which is implicitly created on first usage of any ipyparallel-related functionality. So there is actually no ipyparallel_view stored in the mybfe object and nothing gets serialised and transmitted to the remote process.

What is happening instead is that the remote process has its own ipyparallel_view global object, which is created on-demand if and when it is used. The creation of the remote ipyparallel_view is done with default settings, and your custom options client_kwargs={'profile':'my-profile'} are not being used.
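This per-process global behaviour can be illustrated with a small toy sketch (plain Python, not pygmo code; all names are hypothetical stand-ins). The pickled object carries no view of its own, so the receiving process falls back to a lazily created default:

```python
import pickle

_view = None  # module-level global, analogous to pygmo's shared view


class ToyBfe:
    """Toy stand-in for ipyparallel_bfe: the view it uses lives in a
    module-level global, not on the instance."""

    def init_view(self, profile):
        global _view
        _view = {"profile": profile}

    def __call__(self):
        global _view
        if _view is None:  # lazily created with default settings
            _view = {"profile": "default"}
        return _view["profile"]


bfe = ToyBfe()
bfe.init_view("my-profile")
print(bfe.__dict__)  # {} -- the custom view is not instance state

payload = pickle.dumps(bfe)  # what mp_island transmits to the worker
_view = None                 # simulate the fresh worker process
remote = pickle.loads(payload)
print(remote())              # 'default' -- the custom profile was lost
```

The instance pickles to an empty state, so whatever settings were applied to the parent's global view never reach the worker.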

I see two possible solutions, both of which I think involve some modifications in the pygmo source code:

  1. either we give the user the possibility to execute some custom setup code whenever a new mp_island is created (so that you could execute init_view(client_kwargs={'profile':'my-profile'}) in this custom code snippet), or
  2. we change ipyparallel_bfe so that each instance contains its own view, and make sure that it is properly (de)serialised when it is pickled.

Personally I would prefer number 2). I initially wrote ipyparallel_bfe and ipyparallel_island to use a global ipyparallel_view because I was wary of potential performance issues when using multiple views, but in hindsight it was probably a premature optimisation mistake.
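Option 2 could be sketched roughly as follows (a minimal toy model, not the actual pygmo implementation; the init_view signature is an assumption loosely based on ipyparallel_bfe.init_view): the instance stores the client settings, drops the live, unpicklable view from its pickled state, and rebuilds the view lazily on the worker side.

```python
import pickle


class PerInstanceBfe:
    """Toy sketch of a per-instance view (hypothetical names): each
    instance remembers its init_view arguments and recreates its own
    view after unpickling."""

    def __init__(self):
        self._view = None
        self._client_args = ()
        self._client_kwargs = {}

    def init_view(self, client_args=(), client_kwargs=None):
        self._client_args = tuple(client_args)
        self._client_kwargs = dict(client_kwargs or {})
        self._view = self._make_view()

    def _make_view(self):
        # In pygmo this would create a real ipyparallel view; here we
        # just record which settings would be used to build it.
        return {"args": self._client_args, "kwargs": self._client_kwargs}

    def __getstate__(self):
        # The live view (a client connection) is not picklable: ship
        # only the settings needed to rebuild it remotely.
        state = self.__dict__.copy()
        state["_view"] = None
        return state

    def __setstate__(self, state):
        self.__dict__.update(state)

    def view(self):
        if self._view is None:  # lazily rebuilt in the worker process
            self._view = self._make_view()
        return self._view


bfe = PerInstanceBfe()
bfe.init_view(client_kwargs={"profile": "my-profile"})
remote = pickle.loads(pickle.dumps(bfe))  # as mp_island would do
print(remote.view())  # {'args': (), 'kwargs': {'profile': 'my-profile'}}
```

With this shape, the worker's copy rebuilds a view from the same custom settings instead of falling back to the default profile.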

To be honest, we never got much user feedback for the ipyparallel-related functionality so it was never tweaked/touched/improved after the initial implementation.

If you have some familiarity with ipyparallel and would like to contribute to pygmo, PRs would be welcome :)

The relevant code would be here:

https://github.com/esa/pygmo2/blob/master/pygmo/_py_bfes.py#L321

I don't think it would be too much work.
