
Optimizing in a discrete configspace #1091

Open
mallanos opened this issue Jan 19, 2024 · 1 comment

mallanos commented Jan 19, 2024

Description

I want to optimize a function that takes three float parameters. However, not all combinations of the three parameters are valid.
Is there a way to define the configspace as a pool of possible solutions, so that SMAC samples configurations as three-dimensional points from that pool?

Steps/Code to Reproduce

What I'm doing now is defining the config space in the regular way:

    def configspace(self) -> ConfigurationSpace:
        # `seed` and `embedding` are defined elsewhere in my setup
        cs = ConfigurationSpace(name="myspace", seed=seed)

        # All three parameters share the range spanned by the embedding
        x0 = Float("x0", (np.min(embedding), np.max(embedding)), default=-3)
        x1 = Float("x1", (np.min(embedding), np.max(embedding)), default=-4)
        x2 = Float("x2", (np.min(embedding), np.max(embedding)), default=5)
        cs.add_hyperparameters([x0, x1, x2])

        return cs

Then, I use the Ask-and-Tell interface to:

  1. Ask for a config or point in the three-dimensional space
  2. Find the closest existing point to the suggested point
  3. Get the score or value associated with that point
  4. Tell smac3 the resulting TrialValue and TrialInfo

    for _ in range(search_iterations):
        info = smac.ask()
        assert info.seed is not None
        # Snap the suggested config to the closest existing point
        score, point = model.sample(info.config, ec=ec, seed=info.seed)
        value = TrialValue(cost=score, time=0.5)
        # Report the snapped point, not the config returned by ask()
        true_info = TrialInfo(
            config=Configuration(
                configuration_space=model.configspace,
                values={
                    "x0": float(point[0]),
                    "x1": float(point[1]),
                    "x2": float(point[2]),
                },
            ),
            seed=info.seed,
        )
        smac.tell(true_info, value)
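The nearest-point lookup in step 2 is, roughly, a Euclidean nearest-neighbour search over the pool of valid points (a minimal numpy sketch; `snap_to_pool` and the example pool are hypothetical names, not part of my actual code):

```python
import numpy as np

def snap_to_pool(x, pool):
    """Return the pool row closest to x in Euclidean distance."""
    pool = np.asarray(pool, dtype=float)
    dists = np.linalg.norm(pool - np.asarray(x, dtype=float), axis=1)
    return pool[np.argmin(dists)]

# The suggested point (0, 0, 0) snaps to the nearest existing point:
nearest = snap_to_pool([0.0, 0.0, 0.0], [[1.0, 1.0, 1.0], [0.2, 0.0, 0.1]])
```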

Expected Results

    all_scores = [smac.runhistory.average_cost(config) for config in smac.runhistory.get_configs()]

I would expect the length of all_scores to equal search_iterations, with no nan values.

Actual Results

When I inspect the results by running:

    all_scores = [smac.runhistory.average_cost(config) for config in smac.runhistory.get_configs()]

I get several nan scores, and the number of sampled configurations is greater than the maximum number of evaluations (search_iterations).

Versions

smac version 2.0.2

Thanks!

@alexandertornede (Contributor) commented:

Hi @mallanos,

thanks for posting this!

The approach you describe has a conceptual problem from my perspective: there is no guarantee that the closest point (depending on the distance metric you use) actually has a comparable acquisition function value.

Moreover, without having looked into this, I assume the nan values arise because you never provide a value for the configuration returned by the ask call, so SMAC automatically fills it with nan. I would need to look into this to confirm that assumption, though.
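To illustrate the assumed mechanism with a toy model (this is a sketch of the bookkeeping I have in mind, not SMAC's actual runhistory implementation): every config handed out by ask() gets an entry, and tell() only fills the entry whose config matches exactly.

```python
import math

def toy_runhistory(asked, told):
    """ask() registers each config with cost nan; tell() fills the cost
    only for a config that matches an existing entry exactly (any other
    config gets its own new entry)."""
    history = {cfg: math.nan for cfg in asked}
    for cfg, cost in told:
        history[cfg] = cost
    return history

# Asking for "a" but telling the snapped config "a_snapped" leaves "a"
# unanswered (nan) and adds an extra entry -- more entries than iterations:
h = toy_runhistory(asked=["a"], told=[("a_snapped", 0.3)])
```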

Depending on the concrete constraints you want to apply to your search space, you can try to work with conditions (https://automl.github.io/ConfigSpace/main/api/conditions.html) and forbidden clauses (https://automl.github.io/ConfigSpace/main/api/forbidden_clauses.html). Just be aware that these are resolved internally by rejection sampling, meaning that a large number of conditions or forbidden clauses can make sampling configurations slow.
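For intuition, rejection sampling amounts to the following loop (a simplified pure-Python illustration, not ConfigSpace's actual implementation; `sample_valid` and its arguments are hypothetical stand-ins):

```python
import random

def sample_valid(sample, is_forbidden, max_tries=10_000):
    """Draw candidates until one passes all constraints (rejection sampling)."""
    for _ in range(max_tries):
        cfg = sample()
        if not is_forbidden(cfg):
            return cfg
    raise RuntimeError("too many rejections; the constraints may be too tight")

random.seed(0)
# Forbid the lower-left quadrant of the unit square:
cfg = sample_valid(
    sample=lambda: (random.random(), random.random()),
    is_forbidden=lambda c: c[0] < 0.5 and c[1] < 0.5,
)
```

The tighter the forbidden region, the more draws are discarded per accepted configuration, which is why many constraints slow sampling down.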

Does that help?
