allow users to specify schedule for re-classification of all points #175

Closed
kjappelbaum opened this issue May 31, 2021 · 0 comments · Fixed by #177
Labels
documentation (Improvements or additions to documentation), enhancement (New feature or request), feature_request, priority

Comments

@kjappelbaum
Owner

Feature description

If the GPR is inaccurate in the first iterations, points can be misclassified. We try to warn users via cross-validation, but this is not practical in all cases. One typical issue I've now seen a few times is that the code discards a point (correctly, given the model predictions at that point) that later turns out to dominate some points that have been classified as Pareto optimal.

The reason this can happen is that discarded points are never considered again, which makes a lot of sense for efficiency but can lead to such errors.
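To make the failure mode concrete, here is a minimal diagnostic sketch. It assumes maximization of all objectives and that the true objective values are known for every point (as in a benchmark dataset); the function names and the boolean marker arrays are hypothetical and not taken from the code base.

```python
import numpy as np


def dominates(a, b):
    """True if `a` is at least as good as `b` in every objective and
    strictly better in at least one (assumption: maximization)."""
    a, b = np.asarray(a), np.asarray(b)
    return bool(np.all(a >= b) and np.any(a > b))


def discarded_dominating_pareto(y, discarded, pareto_optimal):
    """Return indices of discarded points that dominate at least one
    point currently classified as Pareto optimal.

    y: (n_points, n_objectives) array of true objective values
    discarded, pareto_optimal: boolean marker arrays of length n_points
    """
    pareto_idx = np.flatnonzero(pareto_optimal)
    offenders = []
    for i in np.flatnonzero(discarded):
        if any(dominates(y[i], y[j]) for j in pareto_idx):
            offenders.append(int(i))
    return offenders


# Toy example: point 1 was discarded although it dominates point 0.
y = np.array([[1.0, 1.0], [2.0, 2.0], [0.5, 3.0]])
discarded = np.array([False, True, False])
pareto_optimal = np.array([True, False, True])
offenders = discarded_dominating_pareto(y, discarded, pareto_optimal)
```

A non-empty result would indicate that the final classification contains the kind of error described above.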

To fix this, we could allow users to specify a schedule for reclassifying the full space; by default, I'd run this at the end.

Implementation idea

I would use the "scheduling function" mechanism we also use for hyperparameter optimization. I think the default scheduling function should be exponentially spaced and also run the re-classification at the end (when no unclassified points are left).
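A rough sketch of what such a default could look like, assuming a scheduling function takes the iteration number and returns a boolean. The names `exp_spaced_schedule` and `should_reclassify`, and the `base` parameter, are hypothetical and chosen only for illustration.

```python
def exp_spaced_schedule(iteration: int, base: int = 2) -> bool:
    """Return True on exponentially spaced iterations: 1, base, base**2, ..."""
    if base < 2:
        raise ValueError("base must be at least 2")
    if iteration < 1:
        return False
    # Strip factors of `base`; an exact power of `base` reduces to 1.
    while iteration % base == 0:
        iteration //= base
    return iteration == 1


def should_reclassify(iteration: int, n_unclassified: int,
                      schedule=exp_spaced_schedule) -> bool:
    """Reclassify when the schedule fires, and always once at the end,
    i.e. when no unclassified points are left."""
    return n_unclassified == 0 or schedule(iteration)


# Iterations on which the default schedule would fire within the first 20.
fired = [i for i in range(1, 21) if exp_spaced_schedule(i)]
```

Passing the schedule as an argument would let users swap in their own function, for example one that fires every fixed number of iterations.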

For the "reclassification", I'd simply delete all existing classification markers (except the ones for "sampled") and then run the _classify function on all points again.
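The reset step could be sketched as follows. This assumes the classification state is stored as boolean NumPy arrays; the `Markers` container and `reclassify_all` are hypothetical, and `classify` is only a stand-in for whatever the _classify function does.

```python
from dataclasses import dataclass
from typing import Callable

import numpy as np


@dataclass
class Markers:
    """Hypothetical container for the per-point classification state."""
    pareto_optimal: np.ndarray
    discarded: np.ndarray
    unclassified: np.ndarray
    sampled: np.ndarray


def reclassify_all(markers: Markers, classify: Callable[[Markers], None]) -> None:
    """Clear all classification markers except `sampled`, then classify again."""
    markers.pareto_optimal[:] = False
    markers.discarded[:] = False
    markers.unclassified[:] = True
    # `sampled` is deliberately left untouched: measurements stay valid.
    classify(markers)


def dummy_classify(markers: Markers) -> None:
    """Placeholder classifier used only to demonstrate the call."""
    markers.pareto_optimal[0] = True
    markers.unclassified[0] = False


state = Markers(
    pareto_optimal=np.array([False, True, False]),
    discarded=np.array([True, False, False]),
    unclassified=np.array([False, False, True]),
    sampled=np.array([False, True, False]),
)
reclassify_all(state, dummy_classify)
```

After the call, the previously discarded point is back in the unclassified pool and can be picked up again by later iterations.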

We certainly need to explain this "issue" in the docs.

I think this would deserve a thorough benchmarking. Do you have any good datasets in mind that we could use for testing, @byooooo?

@kjappelbaum added the documentation, enhancement, feature_request, and priority labels on May 31, 2021
@kjappelbaum linked a pull request on Jun 14, 2021 that will close this issue (10 tasks)
@kjappelbaum self-assigned this on Jun 14, 2021