Ideally, I think, operations that take a lot of time should both:
- have a way of showing progress, and
- be interruptible.
This was previously discussed in at least #78, #7574, and #7596.
Regarding progress, in #7574 @amueller proposed not baking a progress bar in, but rather adding callbacks. In #7596 @denis-bz suggested callbacks that are passed `locals()`, which is an interesting idea. I also saw that `fit()` in `GradientBoostingClassifier` has the `monitor` parameter, which makes showing progress bars easy.
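For reference, here is a minimal sketch of how the existing `monitor` hook can report progress. As documented for `GradientBoostingClassifier.fit`, the monitor is called after each stage as `monitor(i, self, locals())`, and returning `True` stops fitting. The dataset and the "stop after five stages" rule are only assumptions made for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

# Small synthetic dataset, used only for illustration.
X, y = make_classification(n_samples=200, random_state=0)

seen = []  # stage indices reported to the monitor


def monitor(i, estimator, local_vars):
    """Called after each boosting stage; returning True stops fitting."""
    seen.append(i)
    print(f"stage {i + 1}/{estimator.n_estimators}")
    # Illustrative stopping rule (an assumption): stop after five stages.
    return i + 1 >= 5


clf = GradientBoostingClassifier(n_estimators=20, random_state=0)
clf.fit(X, y, monitor=monitor)
```

A progress bar would simply replace the `print` call; the `local_vars` argument is ignored here.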
In #7596, @denis-bz suggested callbacks can also be used to interrupt computation. In our project, Orange (https://github.com/biolab/orange3), we do something similar: sometimes, where there is no other available mechanism, we raise a BaseException inside a callback to interrupt running threads.
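A rough sketch of that interruption pattern, using only the standard library. `long_computation`, `Interrupted`, and the `Event`-based stop flag are hypothetical names for illustration, not Orange's or scikit-learn's actual code.

```python
import threading
import time


class Interrupted(BaseException):
    """Derives from BaseException so that `except Exception` blocks
    inside the computation do not swallow it."""


def long_computation(callback, n_steps=1000):
    """Stand-in for an iterative fit that reports progress after each step."""
    for step in range(n_steps):
        time.sleep(0.001)  # pretend to do some work
        callback((step + 1) / n_steps)
    return "finished"


stop_requested = threading.Event()
result = {}


def callback(progress):
    # Runs inside the worker thread; raising here unwinds the computation.
    if stop_requested.is_set():
        raise Interrupted()


def worker():
    try:
        result["status"] = long_computation(callback)
    except Interrupted:
        result["status"] = "interrupted"


thread = threading.Thread(target=worker)
thread.start()
stop_requested.set()  # e.g. the user pressed "Stop"
thread.join()
print(result["status"])
```

The computation can only be interrupted at the points where it invokes the callback, so how quickly it stops depends on how often the callback is called.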
Orange uses scikit-learn a lot, and the lack of callbacks in scikit-learn makes showing progress or interrupting hard (we'd like to allow stopping of running computations). For now, we have to resort to hacks. For example, in our Neural Network widget, we subclassed scikit-learn NNs and added a callback on `n_iter_` change (biolab/orange3#2958).
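One way such a hook could look, sketched with a toy base class rather than a real scikit-learn estimator (this is not necessarily what the Orange PR does). `ToyIterativeModel` and `ModelWithCallback` are hypothetical names; the idea is to turn `n_iter_` into a property in a subclass so every assignment triggers the callback.

```python
class ToyIterativeModel:
    """Hypothetical stand-in for an estimator that updates n_iter_ while fitting."""

    def fit(self, n_epochs=5):
        self.n_iter_ = 0
        for _ in range(n_epochs):
            self.n_iter_ += 1  # one training epoch would happen here
        return self


class ModelWithCallback(ToyIterativeModel):
    """Subclass that intercepts writes to n_iter_ and notifies a callback."""

    def __init__(self, callback=None):
        self.callback = callback
        self._n_iter = 0

    @property
    def n_iter_(self):
        return self._n_iter

    @n_iter_.setter
    def n_iter_(self, value):
        self._n_iter = value
        if self.callback is not None:
            self.callback(value)


calls = []
ModelWithCallback(callback=calls.append).fit(n_epochs=3)
print(calls)  # [0, 1, 2, 3]
```

This works without touching the base class, but it depends on the base class assigning `n_iter_` during fitting, which is exactly the kind of implementation detail that makes it a hack.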
We would like to help implement callbacks, but first we are asking whether you would even consider having something similar to `GradientBoostingClassifier`'s `monitor` in the other classes. What do you think?
Then we could try designing an interface together and slowly start adding it to certain classes.
Hello Marko! Without reviewing the past discussion of these topics (including perhaps my past opinions which may not agree with the present), I think callbacks are a Useful Thing, and have become expected functionality in machine learning libraries.
I don't think a parameter to fit is necessarily in accordance with how we like to design things these days, and we'd probably choose to make it a class parameter, especially if it might be used for early stopping and hence be deemed a hyperparameter.
I also think callbacks are a good alternative to us receiving ad-hoc contributions of logging / progress meters.
Some risks:
- It will introduce an additional parameter, but perhaps this is a small cost. Will it potentially introduce more than one parameter?
- It may reduce performance, but I assume this will be negligible.
- It may make introducing parallelism to some implementations harder.
- It's hard to know what to pass to a callback. Ensuring backwards compatibility of the callback's args, consistency across estimators, and up-to-date documentation is potentially a big maintenance cost. We may want to think about how to make this usable but maintainable. This will take some prototyping and case studies.
- It may take a long time to review and merge many such implementations.
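As a starting point for that prototyping, here is one possible shape, purely a sketch: the callback is a constructor parameter, and it receives a small, explicit payload instead of `locals()`, which might be easier to keep backwards compatible. `IterationInfo`, `ToyEstimator`, and the field names are all hypothetical and not an existing scikit-learn API.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class IterationInfo:
    """Hypothetical, deliberately small payload passed to callbacks."""
    iteration: int
    max_iter: int
    loss: float


class ToyEstimator:
    """Hypothetical estimator; `callback` is set in __init__, not passed to fit."""

    def __init__(self, max_iter: int = 10,
                 callback: Optional[Callable[[IterationInfo], Optional[bool]]] = None):
        self.max_iter = max_iter
        self.callback = callback

    def fit(self):
        loss = 1.0
        self.n_iter_ = 0
        for i in range(self.max_iter):
            loss *= 0.5  # pretend optimisation step
            self.n_iter_ = i + 1
            if self.callback is not None:
                # A truthy return value requests early stopping.
                if self.callback(IterationInfo(i, self.max_iter, loss)):
                    break
        return self


def stop_when_converged(info: IterationInfo) -> bool:
    return info.loss < 0.1


est = ToyEstimator(max_iter=10, callback=stop_when_converged).fit()
print(est.n_iter_)  # 4
```

A fixed payload trades flexibility for stability: callbacks see less than they would with `locals()`, but the fields become a documented contract that can be kept consistent across estimators.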