
Callbacks API #16925

Closed
wants to merge 14 commits into from

Conversation

@rth (Member) commented Apr 14, 2020

This is a first iteration of an API for callbacks, which could be used e.g. for monitoring the progress of calculations and convergence, as well as, potentially, for early stopping. The goal of this PR is to experiment with callbacks, which would likely serve as a basis for a SLEP.

As proposed in the PR documentation, to support callbacks in their current iteration, an estimator should:

  • at the beginning of fit, either explicitly call self._fit_callbacks(X, y) or use self._validate_data(X, y), which
    makes a self._fit_callbacks call internally;
  • for iterative solvers, call self._eval_callbacks(n_iter=.., **kwargs) at
    each iteration, where the kwargs keys must be part of the supported callback
    arguments (cf. list below; see also the sketch after the next paragraph). The question is whether we can meaningfully standardize the parameters passed as kwargs; just passing locals() won't do.

User-defined callbacks must extend the sklearn._callbacks.BaseCallback
abstract base class.
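
For illustration, the estimator side could look roughly like the following sketch (ToyEstimator, max_iter and the loop body are made up for the example; only _fit_callbacks, _eval_callbacks and _validate_data come from this PR):

from sklearn.base import BaseEstimator


class ToyEstimator(BaseEstimator):
    """Illustrative estimator wired for callbacks (not part of the PR)."""

    def __init__(self, max_iter=10):
        self.max_iter = max_iter

    def fit(self, X, y):
        # in this PR, _validate_data calls self._fit_callbacks(X, y) internally;
        # alternatively the estimator can call it explicitly at the start of fit
        X, y = self._validate_data(X, y)
        for n_iter in range(self.max_iter):
            # ... one iteration of the solver ...
            # notify the callbacks with the supported keyword arguments
            self._eval_callbacks(n_iter=n_iter)
        return self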

For instance, some callbacks based on this PR are implemented in the
sklearn-callbacks package (see its README for detailed examples):

Progress bars #7574 #78 #10973

from sklearn_callbacks import ProgressBar
from sklearn.compose import make_column_transformer
from sklearn.datasets import make_classification
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler, StandardScaler

X, y = make_classification(n_samples=500000, n_features=200, random_state=0)

pipe = make_pipeline(
    SimpleImputer(),
    make_column_transformer(
        (StandardScaler(), slice(0, 80)),
        (MinMaxScaler(), slice(80, 120)),
        (StandardScaler(with_mean=False), slice(120, 180)),
    ),
    LogisticRegression(),
)

pipe._set_callbacks(ProgressBar())
pipe.fit(X, y)

[screenshot: SGD progress bar]

Determining which callback originates from which estimator is actually non-trivial (and I haven't even started dealing with parallel computing). Currently I'm re-building a separate, approximate computational graph for pipelines etc. Anyway, once that is done (in a separate package), it could be used to animate model training on a graph (similar to what dask.diagnostics does) or, say, the HTML repr of pipelines by @thomasjpfan, via some Jupyter widget.

Monitoring convergence #14338 #8994 (comment)

Having callbacks is also quite useful for monitoring model convergence, e.g.:

from sklearn_callbacks import ConvergenceMonitor
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_regression(random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

conv_mon = ConvergenceMonitor("mean_absolute_error", X_test, y_test)

pipe = make_pipeline(StandardScaler(), Ridge(solver="sag", alpha=1))
pipe._set_callbacks(conv_mon)
_ = pipe.fit(X_train, y_train)
conv_mon.plot()

[figure: convergence monitor]

For now this only works for a small subset of linear models. One reason is that in the iteration loop the solver must provide enough information to reconstruct the model, which is not always the case. For instance, for linear models it would be params + coef + intercept, but even then the definition of coef varies significantly across linear models (and whether we fit_intercept or not, etc.).
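
To make the difficulty concrete, here is a rough sketch of how a convergence monitor might rebuild a linear model at each iteration, assuming the solver passes coef and intercept as in this PR (the class and everything around it is hypothetical, not the sklearn-callbacks implementation):

import copy

from sklearn.metrics import mean_absolute_error


class ToyConvergenceMonitor:
    """Hypothetical monitor: score a partially fitted linear model on held-out data."""

    def __init__(self, X_val, y_val):
        self.X_val, self.y_val = X_val, y_val
        self.scores = []

    def on_fit_begin(self, estimator, X, y):
        # keep a copy of the estimator whose coefficients we can overwrite
        self._model = copy.deepcopy(estimator)

    def on_iter_end(self, n_iter=None, coef=None, intercept=None, **kwargs):
        if coef is None:
            # this solver does not expose enough state to rebuild the model
            return
        # the exact shape and meaning of coef varies across linear models,
        # which is precisely the standardization problem mentioned above
        self._model.coef_ = coef
        self._model.intercept_ = intercept
        y_pred = self._model.predict(self.X_val)
        self.scores.append((n_iter, mean_absolute_error(self.y_val, y_pred)))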

Early stopping #10973

The idea is that callbacks should be able to interrupt training, e.g. because some evaluation metric no longer decreases on a validation set (cf. the figure above), or for other user-defined reasons. This part is not yet included in this PR.

TODO

  • get some preliminary feedback
  • study libraries with callbacks (e.g. dask, keras, pytorch etc)
  • add support for early stopping
  • add callbacks to more estimators and experiment with practical use cases in sklearn-callbacks
  • write SLEP

@rth rth marked this pull request as draft April 23, 2020 16:09
@amueller (Member) commented Jun 2, 2020

this is super awesome. Are you thinking about working further on this? Working on logging and callbacks could be a cool topic for the MLH fellowship thing. As a first pass we could do 'only' logging though?

@rth rth changed the title WIP Callbacks API Callbacks API Jun 2, 2020
@rth (Member Author) commented Jun 2, 2020

Yes, I would still like to work on this but I don't have much availability at the moment.

I think working on callbacks as part of MLH would be nice, and in particular logging would indeed be a good start. Using callbacks with a logging handler would IMO be better than having conditional print or logging.info calls everywhere. The question is what the next steps on this would be.

I marked this as WIP, but it's basically a minimal working implementation, and the API should be sufficient for logging. It would then need to be applied to all estimators and replace our current logging approach.

If I had to change anything here, it might be to make the callback method names a bit closer to Keras callbacks. In the end I'm not sure a SLEP is ideal for this; maybe introducing it as a private feature and incrementally improving it, as we did for estimator tags, would be better, as it's hard to plan all of it ahead. None of the additions should have an impact on users at present.

Even a superficial review would be much appreciated, cc @NicolasHug @thomasjpfan @glemaitre

@NicolasHug (Member) left a comment

Nice stuff!

Made a few comments, but I mostly have questions at this point.

Comment on lines +17 to +18
[tool.black]
line-length = 79
Member

sneaky :p

Member Author

Yeah, I don't see the point in manually formatting code anymore for new files. It shouldn't hurt even if we are not using it everywhere.

}


def _check_callback_params(**kwargs):
Member

Shouldn't we let each callback independently validate its data?

My question might not make sense but I don't see this being used anywhere except in the tests

Member Author

Shouldn't we let each callback independently validate its data?
My question might not make sense but I don't see this being used anywhere except in the tests

Yes, absolutely, each callback validates its own data. But we also need to enforce in tests that callbacks follow the documented API, for instance that no undocumented parameters are passed, which requires this function.

Third-party callbacks could also use this validation function, similarly to how we expose check_array.
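
Something along the following lines, where the set of documented parameter names is only a guess here (the actual list lives in sklearn/_callbacks.py in this PR):

# hypothetical whitelist; the real documented parameters are defined in the PR
SUPPORTED_CALLBACK_PARAMS = {"n_iter", "coef", "intercept", "loss"}


def _check_callback_params(**kwargs):
    """Raise if a kwarg is not part of the documented callback arguments."""
    invalid = set(kwargs) - SUPPORTED_CALLBACK_PARAMS
    if invalid:
        raise ValueError(
            "Unsupported callback parameters: %s; supported parameters are: %s"
            % (sorted(invalid), sorted(SUPPORTED_CALLBACK_PARAMS))
        )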

sklearn/base.py Outdated
Comment on lines 445 to 446
In the case of meta-estimators, callbacks are also set recursively
for all child estimators.
Member

Thoughts on doing this vs letting users set callbacks on sub-estimator instances?

what about e.g. early stopping when we ultimately support this?

Member Author

I added a deep=True option so that recursion into meta-estimators can be disabled (with deep=False), which can certainly be useful in some cases.

In most cases, though, I don't see users manually setting callbacks for each individual estimator in a complex pipeline.

Comment on lines 539 to 542
if callbacks is not None:
with gil:
_eval_callbacks(callbacks, n_iter=n_iter, coef=weights_array,
intercept=intercept_array)
Member

I remember a strange behavior during the Paris sprint last year (I think it was with @pierreglaser and @tomMoral?) where the GIL was acquired in a condition like this, and even when the condition was always False the code was significantly slower.

Might be something to keep in mind.

Member Author

It's clearly an issue with parallel code cython/cython#3554 (and I'm not sure how to handle parallel code with callbacks so far).

However, occasionally acquiring the GIL in long-running loops is beneficial, as users can't interrupt the calculation with Ctrl+C otherwise. So acquiring the GIL at the end of each epoch would actually solve a bug here #9136 (comment)

Will switch to acquiring the GIL at the end of each epoch even if callbacks is None.

@@ -103,6 +104,8 @@ def _mv(x):
# old scipy
coefs[i], info = sp_linalg.cg(C, y_column, maxiter=max_iter,
tol=tol)
if callbacks is not None:
Member

maybe you can remove this since the is not None check is also done in _eval_callbacks


def _eval_callbacks(self, **kwargs):
"""Call callbacks, e.g. in each iteration of an iterative solver"""
from ._callbacks import _eval_callbacks
Member

why lazy import?

Comment on lines +44 to +45
if callbacks is None:
return
Member

Shouldn't we rely on callbacks being an empty list instead of special-casing None?

Or maybe you are anticipating a future where callbacks=None would be a default argument to estimators?

Member Author

Yes, I switched to callbacks=None when no callbacks are set.

assert callback.n_calls == 0
estimator.fit(X, y)
if callback.n_fit_calls == 0:
pytest.skip("callbacks not implemented")
Member

otherwise assert it's equal to 1?



def check_has_callback(est, callback):
assert getattr(est, "_callbacks", None) is not None
Member

is this different from hasattr? Or can the attribute exist and be None for some reason?

Member Author

Yes, reworded more clearly as,

assert hasattr(est, "_callbacks") and est._callbacks is not None

since None could be equivalent to [], just to be sure that this is not happening.

@@ -1335,7 +1345,7 @@ def transform(self, X):
beta_loss=self.beta_loss, tol=self.tol, max_iter=self.max_iter,
alpha=self.alpha, l1_ratio=self.l1_ratio, regularization='both',
random_state=self.random_state, verbose=self.verbose,
shuffle=self.shuffle)
shuffle=self.shuffle, callbacks=getattr(self, '_callbacks', []))
Member

I feel like we should either decide that

  • no callbacks means an empty list
  • no callbacks means None and having callbacks means a non-empty list

but it seems that the code is mixing both right now?

If we ultimately plan on having callbacks=None as a default param then the latter would be more appropriate?

Member Author

Yes, you are right, let's go with callbacks=None everywhere, and just let the eval function handle it.

Member Author

Actually, here we would still need to provide the default option to getattr: getattr(self, '_callbacks', None), and that's not really much more readable than getattr(self, '_callbacks', []), so both would work.
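
Concretely, the convention agreed on here amounts to something like this in the eval helper (a sketch, not the exact code in the diff):

def _eval_callbacks(callbacks, **kwargs):
    # "no callbacks" is represented as None (or an empty list); either way
    # there is nothing to do, so callers don't need to check beforehand
    if not callbacks:
        return
    for callback in callbacks:
        callback.on_iter_end(**kwargs)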

@rth rth mentioned this pull request Jun 4, 2020
@rth rth marked this pull request as ready for review June 4, 2020 14:33
@rth (Member Author) commented Jun 4, 2020

Thanks @NicolasHug ! I think I have addressed your comments, please let me know if you have other questions.

I have updated the required callback method names, inspired by Keras,

class MyCallback(BaseCallback):

    def on_fit_begin(self, estimator, X, y):
        ...

    def on_iter_end(self, **kwargs):
        ...

which I think is more explicit than earlier names.

what about e.g. early stopping when we ultimately support this?

The plan for early stopping so far is:

  • in on_fit_begin (which is also called inside self._validate_data), return X, y, sample_weight with the training data, excluding the validation set. Store the validation set on the callback.
  • in on_iter_end, compute the validation score (this is actually not easy, depending on what information we have when the callback is called), and return a not-None value to interrupt the training loop.

For a complex pipeline with callbacks set recursively, we clearly need to do this only for estimators where it makes sense, but I think it should be doable. In any case this can probably be done in a follow-up PR. For now I put in the documentation that the return value of callback methods is ignored.
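
A rough sketch of that plan, with all names and the return conventions to be read as tentative (in particular, returning a non-None value to request interruption is not implemented in this PR):

from sklearn.model_selection import train_test_split


class ToyEarlyStopping:
    """Hypothetical early-stopping callback following the plan above."""

    def __init__(self, patience=5):
        self.patience = patience

    def on_fit_begin(self, estimator, X, y):
        # carve out a validation set and hand the remaining data back to fit
        X_train, self._X_val, y_train, self._y_val = train_test_split(
            X, y, test_size=0.1, random_state=0
        )
        self._estimator = estimator
        self._best_score, self._n_bad_iter = -float("inf"), 0
        return X_train, y_train, None  # X, y, sample_weight

    def on_iter_end(self, **kwargs):
        # computing this score requires enough state to call score/predict,
        # which, as noted above, is the hard part for most estimators
        score = self._estimator.score(self._X_val, self._y_val)
        if score > self._best_score:
            self._best_score, self._n_bad_iter = score, 0
        else:
            self._n_bad_iter += 1
        if self._n_bad_iter >= self.patience:
            return "stop"  # any non-None value would request interruption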

@rth (Member Author) commented Jun 4, 2020

BTW, regarding logging and #17439 (cc @thomasjpfan, @adrinjalali), there is an example of a logging callback in the sklearn-callbacks repo. For instance, this example

from sklearn.compose import make_column_transformer
from sklearn.datasets import make_classification
from sklearn.impute import SimpleImputer
from sklearn.linear_model import SGDClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler, StandardScaler
from sklearn_callbacks import DebugCallback

X, y = make_classification(n_samples=10000, n_features=100, random_state=0)

pipe = make_pipeline(
    SimpleImputer(),
    make_column_transformer(
        (StandardScaler(), slice(0, 80)), (MinMaxScaler(), slice(80, 90)),
    ),
    SGDClassifier(max_iter=20),
)


pbar = DebugCallback()
pipe._set_callbacks(pbar)

pipe.fit(X, y)

would produce,

python examples/logging-pipeline.py 
2020-06-04 18:10:15,128 INFO     fit_begin Pipeline [...]
2020-06-04 18:10:15,129 INFO     fit_begin SimpleImputer()
2020-06-04 18:10:15,132 INFO     fit_begin SimpleImputer()
2020-06-04 18:10:15,137 INFO     fit_begin ColumnTransformer [...]
2020-06-04 18:10:15,137 INFO     fit_begin StandardScaler()
2020-06-04 18:10:15,143 INFO     fit_begin StandardScaler()
2020-06-04 18:10:15,145 INFO     fit_begin MinMaxScaler()
2020-06-04 18:10:15,148 INFO     fit_begin SGDClassifier(max_iter=20)
2020-06-04 18:10:15,151 INFO     iter_end n_iter=0, loss=246525.58905033383
2020-06-04 18:10:15,152 INFO     iter_end n_iter=1, loss=74070.67406258777
2020-06-04 18:10:15,154 INFO     iter_end n_iter=2, loss=45127.853390181925
2020-06-04 18:10:15,155 INFO     iter_end n_iter=3, loss=33741.79716248915
2020-06-04 18:10:15,156 INFO     iter_end n_iter=4, loss=26982.451195702135
2020-06-04 18:10:15,157 INFO     iter_end n_iter=5, loss=22581.941031038896
2020-06-04 18:10:15,158 INFO     iter_end n_iter=6, loss=20275.28466109527
2020-06-04 18:10:15,160 INFO     iter_end n_iter=7, loss=17646.77498466535
2020-06-04 18:10:15,161 INFO     iter_end n_iter=8, loss=16516.305096143267
2020-06-04 18:10:15,162 INFO     iter_end n_iter=9, loss=15059.036141226175
2020-06-04 18:10:15,163 INFO     iter_end n_iter=10, loss=13906.533547375711
2020-06-04 18:10:15,165 INFO     iter_end n_iter=11, loss=12996.828040061095
2020-06-04 18:10:15,166 INFO     iter_end n_iter=12, loss=12410.182898002984
2020-06-04 18:10:15,167 INFO     iter_end n_iter=13, loss=11715.624170487545
2020-06-04 18:10:15,168 INFO     iter_end n_iter=14, loss=11283.087560475777
2020-06-04 18:10:15,170 INFO     iter_end n_iter=15, loss=10744.568097461684
2020-06-04 18:10:15,171 INFO     iter_end n_iter=16, loss=10361.116776342227
2020-06-04 18:10:15,172 INFO     iter_end n_iter=17, loss=9946.303790908443
2020-06-04 18:10:15,173 INFO     iter_end n_iter=18, loss=9707.741506199667
2020-06-04 18:10:15,175 INFO     iter_end n_iter=19, loss=9390.37295290932

This can clearly be improved, e.g. linked with our verbose flag, and the discussion about a logging API is still very relevant, but I think building a logging solution internally on top of callbacks would help to get a consistent experience across the library, which is not the case currently with our verbose=1 + print approach.
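
For reference, a logging callback of this sort could be as simple as the following sketch (this is not the actual DebugCallback from sklearn-callbacks, just an illustration of routing callback events through the standard logging module):

import logging

logger = logging.getLogger("sklearn.callbacks")


class ToyDebugCallback:
    """Hypothetical logging callback; handlers and formatting are left to the user."""

    def on_fit_begin(self, estimator, X, y):
        logger.info("fit_begin %r", estimator)

    def on_iter_end(self, **kwargs):
        logger.info(
            "iter_end %s",
            ", ".join("%s=%s" % (key, value) for key, value in kwargs.items()),
        )

Because everything goes through a standard handler, verbosity, formatting and output destination would then be controlled via logging configuration rather than per-estimator verbose flags.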

@thomasjpfan (Member)

This can get tricky when we have logging in our Cython or C code.

Having designed callbacks in skorch and reviewed callbacks in fastai, I think it is hard to come up with a complete list of items to pass to the callback. Fastai can pass the caller object directly to the callback and the callback can directly change the state of the model.

For our case, passing metrics should be good enough. For logging, we also commonly log the elapsed time and may have interesting formatting (Pipeline). HistGradientBoosting has some custom metrics that are only related to itself:

Time spent computing histograms: 0.041s
Time spent finding best splits:  0.012s
Time spent applying splits:      0.024s
Time spent predicting:           0.002s

It also logs the number of leaves etc:

[1/10] 1 tree, 31 leaves, max depth = 12, in 0.011s
[2/10] 1 tree, 31 leaves, max depth = 13, in 0.010s

I would not think tree or leaves belong as kwargs passed to the callback.

@rth (Member Author) commented Jun 4, 2020

Thanks for the feedback @thomasjpfan !

Fastai can pass the caller object directly to the callback and the callback can directly change the state of the model

In the current PR we can pass the estimator object to the callback, but we indeed can't do that from C or Cython code since it's not available there. We could take locals(), but it's unclear what to do with them in the callback. I agree that having a complete list of things to pass to the callback is difficult, particularly given that we have much more heterogeneous models than neural net libraries do.

For logging, we also commonly log the elapsed time

For the elapsed time we would also need to add an on_fit_end method. I initially didn't want to do that, as it would mean many more changes all over the code base. We can approximate the elapsed time by taking the difference between two consecutive on_fit_begin calls in a pipeline, but it's clearly not fail-proof (particularly as soon as we start doing things in parallel). I imagine we could also add it only to a few models where it's relevant (Pipeline, ColumnTransformer, etc.).
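
As an illustration of that approximation (purely hypothetical code, and fragile as soon as steps run in parallel):

from time import perf_counter


class ToyTimingCallback:
    """Hypothetical sketch: approximate per-step timings from on_fit_begin alone."""

    def __init__(self):
        self._previous = None  # (name, start time) of the previous fit_begin

    def on_fit_begin(self, estimator, X, y):
        now = perf_counter()
        if self._previous is not None:
            name, start = self._previous
            print("%s ran for roughly %.3fs" % (name, now - start))
        self._previous = (estimator.__class__.__name__, now)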

For HistGradientBoosting, it has some custom metrics that is only related to itself:
I would not think tree or leaves belong to as a kwarg argument in the callback.

Yes, it's one of the challenges: models are really heterogeneous. I guess we could also add an on_log method that just takes a string input, as a workaround for those cases.
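
i.e. roughly the following sketch of the proposal (not something in this PR):

from abc import ABC


class BaseCallback(ABC):
    # ... existing on_fit_begin / on_iter_end hooks ...

    def on_log(self, msg):
        """Hypothetical hook receiving free-form, estimator-specific messages
        (e.g. per-tree summaries or histogram timings) as plain strings."""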

The thing is that if we don't use callbacks for logging, it means that we would have to do very similar things twice, once for logging and once for callbacks.

@jnothman (Member)

We could pass a specified set of objects/data (targeted to logging), as well as mutable locals() for use at one's own risk...?

@amueller (Member) commented Jul 2, 2020

How would this interact with multiprocessing? I'm not sure I understand how you would implement grid-search logging as callbacks with multiprocessing.

@jnothman (Member) commented Jul 2, 2020 via email

@amueller (Member) commented Jul 8, 2020

@jnothman I guess it isn't clear to me how you would write to a shared resource with joblib.

@rth (Member Author) commented Jul 9, 2020

@amueller Yes, it's not that straightforward. One could send callback parameters to a shared queue from different processes, but then one needs an additional thread to process these callback events, and to start/stop this thread at some point. See the example below:

from multiprocessing import Manager
from threading import Thread, Lock
import queue

from joblib import delayed, Parallel
from time import sleep


class Callback():
    def __init__(self, m):
        self.q = m.Queue()
        
    def on_iter_end(self, x):
        res = x**2
        self.q.put(f'callback value {x}')
        return res
    
    @staticmethod
    def process_callback_events(q, should_exit):
        """Process all events in the queue"""
        while True:
            try:
                print(q.get_nowait())
            except queue.Empty:
                print('queue empty')
                if should_exit.locked():
                    return
                else:
                    sleep(1)
    
    def start_processing_callback(self):
        thread_should_exit = Lock()
        thread = Thread(target=self.process_callback_events,
                             args=(self.q, thread_should_exit))
        thread.start()
        return thread_should_exit
    

m = Manager()
        
cbk = Callback(m)

callback_processing_stop = cbk.start_processing_callback()
res = Parallel(n_jobs=2)(
    delayed(cbk.on_iter_end)(x) for x in range(10)
)
callback_processing_stop.acquire(blocking=False)

print(res)

which would produce,

queue empty
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
callback value 0
callback value 1
callback value 2
callback value 3
callback value 4
callback value 5
callback value 6
callback value 7
callback value 8
callback value 9
queue empty

But then I'm sure there are plenty of edge cases with different joblib backends that would need handling (e.g. nested parallel blocks), and this would require re-designing the callbacks API somewhat.
In any case, for progress bars I'm not sure there is a way around starting an extra monitoring thread when joblib is used. Though maybe it's not a big issue; for instance, tqdm does that as far as I understand, and this would only run if the user explicitly activates the callback.

@amueller (Member) commented Jul 15, 2020

Do you think this is worth the complexity? It seems like a can of worms, but it also would open a lot of doors (does that qualify as mixed metaphors?).
I haven't looked into the logging alternative, which I assume would be somewhat simpler but much less powerful.

Is it plausible / feasible to make the actual usage mostly hidden from the user? I assume that could work if joblib maybe gets some special hook? I'd rather not litter the sklearn code with callback locking code.

And is there an easy way for the callback to determine the backend?

We could also go to logging first if it's "easy enough" and then try to go to callbacks later once we figure out how to do it and whether it's worth it?

@rth (Member Author) commented Jul 15, 2020

I haven't looked into the logging alternative, which I assume would be somewhat simpler but much less powerful.

As far as I can tell, the logging approach in the multiprocessing case would work very similarly, with a queue (logging.handlers.QueueHandler) and a monitoring thread (logging.handlers.QueueListener) #78 (comment). It's a bit more abstracted from the user, but with the same issue of needing to start/stop this monitoring thread either around each parallel section or somehow globally, and of needing to account for all the joblib special cases. The implementation of logging.handlers.QueueListener is not very far from the above prototype.
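
For completeness, a minimal sketch of that logging-based variant (illustrative only; the logger name and worker function are made up):

import logging
from logging.handlers import QueueHandler, QueueListener
from multiprocessing import Manager

from joblib import Parallel, delayed


def work(i, log_queue):
    # each worker sends log records to the shared queue instead of
    # writing to its own handlers
    logger = logging.getLogger("sklearn.demo")
    if not logger.handlers:
        logger.addHandler(QueueHandler(log_queue))
        logger.setLevel(logging.INFO)
    logger.info("iteration %d done", i)
    return i ** 2


if __name__ == "__main__":
    manager = Manager()
    log_queue = manager.Queue()
    # the listener runs a background thread in the main process, playing the
    # same role as the monitoring thread in the prototype above
    listener = QueueListener(log_queue, logging.StreamHandler())
    listener.start()
    results = Parallel(n_jobs=2)(delayed(work)(i, log_queue) for i in range(10))
    listener.stop()
    print(results)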

Do you think this is worth the complexity? [...]
Is it plausible / feasible to make the actual usage mostly hidden from the user? I assume that could work if joblib maybe get some special hook? I'd rather not litter the sklearn code with callback locking code.

In terms of complexity, as long as it's isolated in the optional Callback object it might not be so bad. The above code can certainly be improved as well. The issue is indeed more the need to start/stop the callback/log monitoring thread around each parallel section, which is not ideal. We should talk with the joblib devs to see if this could be made more user-friendly (both here and for logging).

@amueller (Member)

You are in closer physical proximity to the joblib people ;)
I have no idea about joblib tbh but I figure delayed could accept a more complex object that has __setup__ and __teardown__ in addition to __call__?

It looks like there's no way around modifying each call to Parallel that contains logging, but I guess that would have been the case anyway.

Thinking about it a bit more, does that happen very often? You could do a bunch in this PR without it. It's not only GridSearchCV, right? I guess OVO and OVR and VotingClassifier are candidates?
How would we log in RandomForestClassifier? I guess we'd need the queue, right?

In other words, does it make sense to do a first solution where we don't handle the parallel case? I guess users will probably be miffed if we introduce an interface but it's not supported in GridSearchCV.

@rth (Member Author) commented Jul 15, 2020

With @thomasjpfan we discussed that maybe it could be simpler to start with implementing just ConvergenceMonitor (to monitor convergence on the train and validation sets) as the first simple callback in scikit-learn. That would only apply to the final classifier/regressor and might not need to deal with all the parallel stuff at first (though RandomForestClassifier, or even linear models with OVR, hmm). There the additional issue is rather how one reconstructs the full model state from the information available in the callback, in order to evaluate it on the validation set. That is related to the question of how to implement early stopping, which would also not be easy to support in all estimators in one go.

In general I'm all for small incremental PRs. Having this in the private API, even without parallel support, could already be quite useful. It's more a question of how confident we are that we will be able to add support for joblib later, if needed, without massively changing the API, and that the chosen API is generally reasonable.

@dbczumar commented Jul 28, 2020

Hi folks, just chiming in here to note that a callbacks API would be very useful for projects that collect / aggregate metadata for model training routines. In particular, the MLflow project would benefit greatly from a callbacks API in the context of its upcoming scikit-learn "autologging feature" (discussed here: mlflow/mlflow#2050): callbacks would enable MLflow to patch in custom hooks that store per-epoch metrics for a given scikit-learn training session in a centralized tracking service.

We would also be very excited to see this capability introduced as a private API in an upcoming scikit-learn release, even if it does not apply to all classifiers or to parallel execution environments (joblib, Dask, etc), as alluded to in #16925 (comment).

@mfeurer (Contributor) commented Oct 6, 2020

Hi folks, I also briefly wanted to propose another use of a future callbacks API that would be very useful for projects that train ML models under a time budget: stopping the training when the time is up. This would be very handy in the Auto-sklearn project. For each evaluation of the ML model we give a time limit and end the process if the time limit is hit. With such a callback we could safely shut down the process ourselves and still use the partially trained model.

I'm not sure if this use case will be considered (it's some form of early stopping, I guess), but I just wanted to bring up another potential use case that wasn't discussed yet.

@MJimitater

Hello folks, I love and strongly support the idea of callbacks in sklearn. Thanks for the progress so far; is this still a WIP for other algorithms, such as GaussianProcessClassifier?

@ogrisel (Member) commented Mar 16, 2021

+1 for moving forward with this PR with a private-only callback registration API for now, possibly without specific support for the parallel case (as long as the callbacks themselves are picklable).

We already override the delayed function in scikit-learn, so @rth feel free to prototype something with it to make it possible to implement additional logic on the loky workers for callbacks that need access to a shared resource. I think we would need to prototype a specific use case to see if we really need to change things in joblib or not.

@rth (Member Author) commented Dec 17, 2021

Superseded by #22000
