Why does get_incumbent_id select the incumbent only from runs with max_budget? #102

Open
2533245542 opened this issue Jan 31, 2021 · 1 comment

@2533245542

Isn't it weird that get_incumbent_id() only finds the incumbent based on the losses from max_budget runs?

I think the incumbent should be selected from all configs, regardless of budgets.

Let's say we use epoch as budget.

If

config A has a loss of 5 after running 20 epochs.
config B has a loss of 3 after running 10 epochs.

get_incumbent_id() will say config A is the incumbent. Then the user will build a model with config A using 20 epochs, but in fact, the user should build the model with config B, because config A is overfitting with 20 epochs.

I also suggest something like get_incumbent_id_and_budget() so users can know the optimal hyperparameter value combination as well as the optimal budget for building their model.
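
To make the A/B example above concrete, here is a minimal sketch of the two selection rules using plain dictionaries. It does not use the library; the `runs` layout and both function names are assumptions made purely for illustration.

```python
# Hypothetical results: config id -> {budget in epochs: loss}.
runs = {
    "A": {20: 5.0},
    "B": {10: 3.0},
}


def incumbent_at_max_budget(runs, max_budget):
    """Pick the lowest-loss config among those evaluated at max_budget."""
    candidates = [
        (losses[max_budget], config_id)
        for config_id, losses in runs.items()
        if max_budget in losses
    ]
    return min(candidates)[1] if candidates else None


def incumbent_over_all_budgets(runs):
    """Pick the (config id, budget) pair with the lowest loss on any budget."""
    candidates = [
        (loss, config_id, budget)
        for config_id, losses in runs.items()
        for budget, loss in losses.items()
    ]
    if not candidates:
        return None
    _, config_id, budget = min(candidates)
    return config_id, budget


print(incumbent_at_max_budget(runs, max_budget=20))
print(incumbent_over_all_budgets(runs))
```

The first rule only sees config A because B was never run for 20 epochs; the second rule compares losses across budgets, which is only meaningful if losses measured at different budgets are comparable.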

@sfalkner
Collaborator

The answer to your question depends on your problem. The assumption in this implementation is that a larger budget is more reliable, and that the goal is to optimize for the largest budget. The model will at some point disregard evaluations on the smaller budgets and only focus on the largest one. This is why the default behavior is the one you observed.

You might use BOHB to tune a neural network, where a larger budget does not necessarily mean better performance, e.g. when early stopping improves performance. But if you consider a problem where the noise depends on the budget, e.g. the number of CV folds, things are different. There, a higher budget means a higher-fidelity evaluation, which makes it more trustworthy.

You can use get_incumbent_trajectory to get your desired behavior.
This function has two flags, bigger_is_better and non_decreasing_budget. If you set both to False, it returns a dict whose last entry should be the config with the best loss ever seen.

Please feel free to implement get_incumbent_id_and_budget() and open a PR.
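
A possible starting point for such a helper, as a hedged sketch rather than the library's implementation: it assumes that get_incumbent_trajectory returns a dict of parallel lists under the keys 'config_ids', 'budgets', and 'losses', and it reads the last entry of each. `FakeResult` is a hypothetical stand-in so the snippet runs without the library installed; with a real run you would pass the result object instead.

```python
def get_incumbent_id_and_budget(result):
    """Return (config_id, budget, loss) for the best loss seen on any budget.

    Assumes `result` offers get_incumbent_trajectory with the two flags
    discussed above and returns parallel lists keyed by 'config_ids',
    'budgets', and 'losses'.
    """
    trajectory = result.get_incumbent_trajectory(
        bigger_is_better=False,
        non_decreasing_budget=False,
    )
    if not trajectory["config_ids"]:
        return None
    return (
        trajectory["config_ids"][-1],
        trajectory["budgets"][-1],
        trajectory["losses"][-1],
    )


class FakeResult:
    """Hypothetical stand-in for a result object, for demonstration only."""

    def get_incumbent_trajectory(self, all_budgets=True, bigger_is_better=True,
                                 non_decreasing_budget=True):
        # Made-up trajectory: a 20-epoch run with loss 5, then a
        # 10-epoch run of another config with loss 3.
        return {
            "config_ids": [(0, 0, 0), (0, 0, 1)],
            "times_finished": [1.0, 2.0],
            "budgets": [20, 10],
            "losses": [5.0, 3.0],
        }


print(get_incumbent_id_and_budget(FakeResult()))
```

Returning the budget alongside the config id lets the caller retrain with the budget at which the best loss was actually observed.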
