
Bias from independent within-trial restarts when instances are not uniform #2257

Open
nikohansen opened this issue Dec 20, 2023 · 1 comment


nikohansen commented Dec 20, 2023

Following up on #1117 (comment). In example_experiment2.py we conduct independent within-trial restarts on each instance, thereby exhausting the budget; the expected number of repetitions then depends on the instance success rate. In example_experiment3.py we (plan to) have uniform instance repetitions where, depending on the number of successes over all instances, each instance is repeated the same number of times to exhaust the budget.

When instances do not "behave" similarly (the original assumption is that they do), we currently run more repetitions on the difficult instances, which does not seem right. Assume for simplicity two instances, one with success rate one and one with success rate zero, a solver that terminates on these instances after evals1 and evals0 evaluations, respectively, and an experimentation budget > evals0. With within-trial restarts, the expected runtime over both instances (AKA the evaluations per single success) computes to budget + evals1, whereas with uniform instance repetitions we get evals0 + evals1. The discrepancy grows the earlier the solver terminates on unsuccessful instances (the smaller evals0) and the larger the experimentation budget. The observation under within-trial restarts contradicts the idea of a budget-independent setup: it is dominated by a (somewhat arbitrary) input parameter, the budget, which renders the observation somewhat meaningless. Consequently, results based on within-trial restarts crucially depend on the assumption that instances "behave similarly" (are uniform). With uniform instance repetitions as in example_experiment3.py, this assumption is considerably less crucial.
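The two-instance example above can be spelled out with concrete (made-up) numbers; the values of evals0, evals1, and the budget are arbitrary placeholders chosen only to illustrate the arithmetic:

```python
# Hypothetical numbers for the two-instance example: the "easy"
# instance succeeds after evals1 evaluations, the "hard" instance
# never succeeds and terminates after evals0 evaluations per run.
evals1 = 100      # evaluations until success on the easy instance
evals0 = 50       # evaluations per unsuccessful run on the hard instance
budget = 10_000   # experimentation budget, with budget > evals0

# Within-trial restarts: the hard instance is restarted until the
# budget is exhausted, so the full budget is spent on it.
ert_within_trial = (budget + evals1) / 1   # one success overall

# Uniform instance repetitions: both instances are run the same
# number of times (here once), so the hard instance contributes
# only evals0.
ert_uniform = (evals0 + evals1) / 1

print(ert_within_trial)  # 10100.0 -- grows with the budget
print(ert_uniform)       # 150.0   -- independent of the budget
```

The first value scales with the budget while the second does not, which is the budget dependence criticized above.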

A second advantage of uniform repetitions is that we can more easily recover observations like those from the old setup than vice versa. We also collect more runtimes for easier targets, which are bypassed under within-trial restarts when they were already reached in a previous run.

A third advantage of uniform repetitions is that we can infer the number of independent restarts from the number of instances displayed in the figure. EDIT: is this relevant when we do not have the success rate?

Disadvantages of making success-dependent uniform instance repetitions:

  • before each "restart", we need to read in all data to decide which instances to repeat.

  • we cannot choose a specific initial solution for the first run only (on each instance). This requires one within-trial restart, or breaking the experimental setup by running some (or one) instance(s) with a different initialization plus some "magic" to pick the right instance as the first for simulated restarts. This may be considered a feature, as it forces us to make this distinction more explicit.
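The driver loop for uniform repetitions could look roughly as follows. This is only a sketch under assumptions: `solve` and `run_uniform_repetitions` are hypothetical names, and in the real setup the already-collected data would be read back from disk before each round, as noted in the first bullet above:

```python
def run_uniform_repetitions(instances, solve, total_budget):
    """Sketch of uniform instance repetitions (names hypothetical).

    `solve(instance)` runs the solver once and returns
    (n_evals, success). Every round runs each instance exactly once,
    so after any number of rounds all instances have the same
    repetition count; rounds continue until the budget is exhausted.
    """
    spent = 0
    results = {i: [] for i in instances}
    while spent < total_budget:
        # one uniform round: in a real experiment we would first read
        # in all previously written data to decide whether to continue
        for i in instances:
            n_evals, success = solve(i)
            spent += n_evals
            results[i].append((n_evals, success))
    return results
```

Note that, by construction, no instance is repeated more often than another, regardless of its success rate; this is the property that removes the bias discussed above.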

@nikohansen (Contributor, Author) commented:
By introducing duplicates in DataSet._evals_appended_compute, we draw instance numbers uniformly at random rather than instance appearances to simulate runtimes (in pproc.DataSet.evals_with_simulated_restarts). This is, however, not a remedy when we already have within-trial restarts.
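The distinction between drawing instances versus drawing instance appearances can be illustrated with a small sketch (this is not the actual pproc implementation; the function names and data layout are made up for illustration):

```python
import random

def draw_uniform_over_instances(runs_by_instance, rng=random):
    """Pick an instance uniformly, then one of its recorded runs.

    Each instance is equally likely, regardless of how many runs
    (appearances) it has in the data.
    """
    instance = rng.choice(list(runs_by_instance))
    return rng.choice(runs_by_instance[instance])

def draw_uniform_over_appearances(runs_by_instance, rng=random):
    """Pick uniformly among all recorded runs.

    Instances with more repetitions (typically the difficult ones
    under within-trial restarts) are over-weighted.
    """
    all_runs = [r for runs in runs_by_instance.values() for r in runs]
    return rng.choice(all_runs)
```

With, say, one run on instance 1 and two runs on instance 2, the first scheme picks instance 1 with probability 1/2, the second only with probability 1/3, which is the bias the duplicates are meant to correct.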
