Issue with `JoblibParallelization` #537

cheginit · 2024-01-01T18:03:56Z

The current implementation of `JoblibParallelization` takes an already instantiated `Parallel` object, which can lead to issues such as this one. A better approach would be to accept all of the `Parallel` arguments as input and instantiate `Parallel` inside the `__call__` method. I tested this approach and it works without any issues. I can open a PR if you're interested.


blankjul · 2024-01-10T03:58:23Z

Thanks for your feedback! Can you provide a small example here in this issue of what the interface would look like?
A PR works too, of course.

cheginit · 2024-01-10T15:21:46Z

Sure. This is what I've been using:

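from __future__ import annotations

from pathlib import Path
from typing import Any, Callable, Generator, Iterable, Literal

import joblib


class JoblibParallelization: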
    def __init__(
        self,
        n_jobs: int = -1,
        backend: Literal["loky", "threading", "multiprocessing"] = "loky",
        return_as: Literal["list", "generator"] = "list",
        verbose: int = 0,
        timeout: float | None = None,
        pre_dispatch: str | int = "2 * n_jobs",
        batch_size: int | Literal["auto"] = "auto",
        temp_folder: str | Path | None = None,
        max_nbytes: int | str | None = "1M",
        mmap_mode: Literal["r+", "r", "w+", "c"] | None = "r",
        prefer: Literal["processes", "threads"] | None = None,
        require: Literal["sharedmem"] | None = None,
        *args: Any,
        **kwargs: Any,
    ) -> None:
        self.n_jobs = n_jobs
        self.backend = backend
        self.return_as = return_as
        self.verbose = verbose
        self.timeout = timeout
        self.pre_dispatch = pre_dispatch
        self.batch_size = batch_size
        self.temp_folder = temp_folder
        self.max_nbytes = max_nbytes
        self.mmap_mode = mmap_mode
        self.prefer = prefer
        self.require = require
        super().__init__()

    def __call__(
        self,
        f: Callable[..., Any],
        X: Iterable[Any],
    ) -> list[Any] | Generator[Any, Any, None]:
        with joblib.Parallel(
            n_jobs=self.n_jobs,
            backend=self.backend,
            return_as=self.return_as,
            verbose=self.verbose,
            timeout=self.timeout,
            pre_dispatch=self.pre_dispatch,
            batch_size=self.batch_size,
            temp_folder=self.temp_folder,
            max_nbytes=self.max_nbytes,
            mmap_mode=self.mmap_mode,
            prefer=self.prefer,
            require=self.require,
        ) as parallel:
            return parallel(joblib.delayed(f)(x) for x in X)

It can be easily instantiated without any arguments.
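
For illustration, here is a minimal usage sketch of the same idea. `DeferredJoblibRunner` and `square` are hypothetical names, not part of pymoo; the class is a condensed stand-in for the one above that stores the `Parallel` keyword arguments and only builds the `Parallel` object inside `__call__`. The `threading` backend and `n_jobs=2` are arbitrary choices made to keep the example light.

```python
from __future__ import annotations

from typing import Any, Callable, Iterable

import joblib


class DeferredJoblibRunner:
    """Hypothetical, condensed stand-in for the class above."""

    def __init__(self, **parallel_kwargs: Any) -> None:
        # Only plain keyword arguments are stored; no Parallel object exists yet.
        self.parallel_kwargs = parallel_kwargs

    def __call__(self, f: Callable[[Any], Any], X: Iterable[Any]) -> list[Any]:
        # A fresh Parallel is created for each call and released when the
        # with-block exits.
        with joblib.Parallel(**self.parallel_kwargs) as parallel:
            return parallel(joblib.delayed(f)(x) for x in X)


def square(x: int) -> int:
    return x * x


runner = DeferredJoblibRunner(n_jobs=2, backend="threading")
results = runner(square, range(5))
print(results)
```

Because the runner holds only plain values between calls, it can be copied like any other simple object, and each call gets its own `Parallel` instance.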


blankjul self-assigned this Jan 10, 2024

blankjul added the feature request label Jan 10, 2024

cheginit pushed a commit to cheginit/pymoo that referenced this issue Jan 10, 2024

ENH: Rewrite JoblibParallelization following the discussion at anyopt…

a307d7b

…imization#537 [skip ci]

cheginit mentioned this issue Jan 10, 2024

ENH: Rewrite JoblibParallelization #541

Open
