
Parameter Types: Handling and Implementation #376

Open
1 task done
till-m opened this issue Nov 7, 2022 · 7 comments

Comments

@till-m
Member

till-m commented Nov 7, 2022

Why?
Support for non-float parameter types is by far the most-requested feature that this package currently lacks.

Why this issue?
I would like to explicitly discuss if and how to implement parameter types. In that sense this issue isn't a feature request, but is intended to serve as a space to collect discussion of this topic.

How?
From my perspective, the approach of Garrido-Merchán and Hernández-Lobato seems to make the most sense.
This means converting the parameters within the kernel: $$\tilde{k}( x_i, x_j)=k(T(x_i), T(x_j))$$ where $T$ acts on elements of $x_i$ according to their type.
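To make the transform concrete, here is a minimal sketch of one possible $T$ in the spirit of Garrido-Merchán and Hernández-Lobato: floats pass through, integers are rounded, and each categorical block is snapped to a one-hot vector. The parameter-layout format and the helper name `make_transform` are illustrative assumptions, not this package's actual API.

```python
import numpy as np

def make_transform(param_types):
    """Build T for a flat vector whose entries are typed per `param_types`.

    `param_types` is a list of 'float', 'int', or ('cat', n_options),
    where a categorical parameter occupies n_options one-hot slots.
    (This layout is a hypothetical example, not the package's real format.)
    """
    def T(x):
        x = np.asarray(x, dtype=float)
        out, i = [], 0
        for t in param_types:
            if t == 'float':
                out.append(x[i]); i += 1            # floats pass through
            elif t == 'int':
                out.append(np.round(x[i])); i += 1  # ints are rounded
            else:                                   # categorical: snap the
                _, n = t                            # block to one-hot at
                block = np.zeros(n)                 # its largest entry
                block[np.argmax(x[i:i + n])] = 1.0
                out.extend(block); i += n
        return np.array(out)
    return T

# one float, one int, one 3-option categorical
T = make_transform(['float', 'int', ('cat', 3)])
assert list(T([0.7, 2.4, 0.1, 0.8, 0.3])) == [0.7, 2.0, 0.0, 1.0, 0.0]
```

Because $T$ is applied inside the kernel, the GP itself still operates on a continuous space; only the covariance sees the snapped values.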

What is necessary?
Essentially, all of this "only" requires three functions to transform the parameters:

  • A function that converts the "canonical" representation of the parameters to the representation used by the kernel.
    • float and int parameters remain unchanged; one-hot encoding is applied to categorical variables.
  • A function that converts the kernel representation back to the canonical representation, used whenever the user interacts with parameters (logs, optimizer.max(), etc.).
    • float and int parameters remain unchanged; the one-hot encoding is reversed.
  • A function that converts the all-float parameter suggestions produced by _space.random_sample() and acq_max() into the kernel representation.
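The first two functions form an encode/decode round trip between the user-facing dict and the flat kernel vector. A minimal sketch follows; the `SPEC` format and the names `to_kernel_repr`/`to_canonical` are hypothetical, chosen only to illustrate the round trip.

```python
import numpy as np

# Hypothetical spec: parameter name -> 'float' | 'int' | list of categories
SPEC = {'lr': 'float', 'layers': 'int', 'act': ['relu', 'tanh', 'sigmoid']}

def to_kernel_repr(params, spec=SPEC):
    """Canonical dict -> flat float vector (one-hot for categoricals)."""
    vec = []
    for name, t in spec.items():
        if t in ('float', 'int'):
            vec.append(float(params[name]))
        else:
            vec.extend(1.0 if c == params[name] else 0.0 for c in t)
    return np.array(vec)

def to_canonical(vec, spec=SPEC):
    """Flat kernel vector -> canonical dict (reverses the one-hot)."""
    params, i = {}, 0
    for name, t in spec.items():
        if t == 'float':
            params[name] = float(vec[i]); i += 1
        elif t == 'int':
            params[name] = int(round(vec[i])); i += 1
        else:
            params[name] = t[int(np.argmax(vec[i:i + len(t)]))]
            i += len(t)
    return params

p = {'lr': 0.7, 'layers': 2, 'act': 'tanh'}
assert to_canonical(to_kernel_repr(p)) == p  # round trip is lossless
```

The third function (for _space.random_sample() and acq_max() output) would reuse the same per-type rules, since those suggestions are already flat float vectors.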

Naturally, it requires changing a lot of other code, too; and particularly in ways that make everything a bit messier. Additionally, wrapping the kernel requires some slightly hacky python magic due to the way sklearn.gaussian_process' kernels are set up.
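For illustration, one way the kernel wrapping could look: a thin `sklearn` `Kernel` subclass that applies the transform before delegating to a base kernel. This is a sketch under my own assumptions (class and function names are made up), and the hyperparameter-delegation properties at the bottom are exactly the kind of plumbing that makes this slightly hacky, since sklearn discovers hyperparameters by attribute introspection.

```python
import numpy as np
from sklearn.gaussian_process.kernels import Kernel, Matern

class TransformedKernel(Kernel):
    """k~(x, y) = k(T(x), T(y)); `transform` maps a 2-D array row-wise."""

    def __init__(self, base_kernel, transform):
        self.base_kernel = base_kernel
        self.transform = transform

    def __call__(self, X, Y=None, eval_gradient=False):
        X = self.transform(np.asarray(X))
        Y = self.transform(np.asarray(Y)) if Y is not None else None
        return self.base_kernel(X, Y, eval_gradient=eval_gradient)

    def diag(self, X):
        return self.base_kernel.diag(self.transform(np.asarray(X)))

    def is_stationary(self):
        return self.base_kernel.is_stationary()

    # Delegate hyperparameter plumbing so GP fitting still tunes the base
    # kernel -- sklearn won't find the base kernel's hyperparameters on
    # the wrapper by itself.
    @property
    def theta(self):
        return self.base_kernel.theta

    @theta.setter
    def theta(self, theta):
        self.base_kernel.theta = theta

    @property
    def bounds(self):
        return self.base_kernel.bounds

def round_int_column(X, col=1):
    """Example transform: treat column `col` as an int parameter."""
    X = X.copy()
    X[:, col] = np.round(X[:, col])
    return X

k = TransformedKernel(Matern(nu=2.5), round_int_column)
X = np.array([[0.2, 1.6], [0.9, 3.2]])
K = k(X)  # same matrix as Matern(nu=2.5) evaluated on the rounded inputs
```

Note this sketch sidesteps `get_params`-based cloning subtleties; a production version would need more care there.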

Alternatives
Instead of offering proper support, we could simply refer users to @leifvan's implementation here. Alternatively, we could set up a notebook that demonstrates his approach. I'm almost favouring this approach since I'm worried about cluttering the API too much.

Are you able and willing to implement this feature yourself and open a pull request?

  • Yes, I can provide this feature -- it's pretty much ready/functional too, I will push it soon and link it here.
@bwheelz36
Collaborator

Hey, I must admit I've never had to use categorical parameters and as such never given it much thought :-P
Some very smart people wrote that thread! Agree we should document this more explicitly. Some thoughts:

  • It doesn't actually look like it would be that much additional clutter to the API? e.g. if we just supply a custom kernel and then update pbounds to support optional typing? However I also think just a notebook demo would be quite fine.
  • Also, this package can work with arbitrary kernels, which is not documented in the examples at the moment. So from this perspective, having an example that demonstrates how to construct a custom kernel would be good (I also planned to do this in the noise example that I haven't written yet).
  • If we follow the way this is implemented by @leifvan, we should use set_gp_params instead of bo._gp - I think it should work the same way

@till-m
Member Author

till-m commented Nov 9, 2022

Hi @bwheelz36,

I think you're right about it not cluttering the (user-facing) API too much. Internally, however, there is a substantial back-and-forth happening when transforming the parameters, especially if we want to handle everything for the user -- at least I didn't find a way around this. I think the question is ultimately how "convenient" we want to make things for endusers.

As for my draft, see here. The code itself needs a lot of clean-up but you can see how it would work from a user-perspective.

@Chiu-Ping


Hello till,

This feature to support non-float types is not implemented in the master branch, right?
I'm trying to use the functionality on the master branch, but I always get errors.
BTW, I installed with pip install bayesian-optimization; it cannot process non-float types as shown in your example. Do you have any suggestions? Thanks a lot.

@bwheelz36
Collaborator

Hi, no this feature is not on the main branch yet.
I think you should be able to install the fork you reference above like this (pip takes a git URL with an @branch suffix rather than a /tree/ path):
pip install git+https://github.com/till-m/BayesianOptimization@parameter-types
Having said that I don't think this feature is complete yet.

@bwheelz36
Collaborator

I guess it would be quite helpful to have someone test this right @till-m ?

@till-m
Member Author

till-m commented Dec 20, 2022

Yes, expect it to be a bit unstable, but I would love some feedback :)

@till-m
Member Author

till-m commented Apr 25, 2023

Hi @Chiu-Ping, did you ever get around to testing the feature?

@till-m till-m mentioned this issue May 25, 2023
7 tasks