Outside functionality: add a "fitter" generator as an alternative to "fit" method #19076

jamartinh · 2020-12-28T22:18:47Z

Describe the workflow you want to enable

Currently, estimators have a "fit" method.

Because this is an ordinary method, it is hard to "publish" or "monitor" the progress and evolution of the estimator's internal state.

A "fitter" method written as a generator (using yield statements) would offer major advantages over specialized, standardized callbacks, where the user has to infer how a callback affects the internal loops and has little control over access to the estimator's internal state.

Describe your proposed solution

Define a protocol that lets estimators optionally provide a "fitter()" generator as well, so that people can run checks on every iteration (or every x-th iteration, i.e. when n % x == 0) and then plot, monitor, and use the internal state and parameters of the estimator.

for estimator_output in estimator.fitter(data):
    print(estimator_output)
    pyplot.plot(estimator_output.parameters[0])
    estimator.learning_rate /= 0.9
    if estimator_output.error > 12:
        break

This would provide a rich way of running experiments without spending much time writing sophisticated callbacks.
It would make it possible to stop iterating whenever the user wants, without "ctrl+c".
It would help alleviate some of the black-box feeling around "fit" methods.
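To make the proposal concrete, here is a minimal sketch of what a generator-based fitter could look like. It assumes a toy one-dimensional linear model trained by gradient descent. `ToyLinearRegressor`, `FitState`, and `fitter` are hypothetical names used only for illustration and are not part of scikit-learn.

```python
from dataclasses import dataclass
from typing import Iterator, List, Tuple


@dataclass
class FitState:
    """Snapshot yielded after each iteration (hypothetical, not a scikit-learn type)."""
    iteration: int
    parameters: Tuple[float, float]  # (slope, intercept)
    error: float                     # mean squared error on the training data


class ToyLinearRegressor:
    """Toy 1-D linear model trained by batch gradient descent."""

    def __init__(self, learning_rate: float = 0.05, max_iter: int = 500):
        self.learning_rate = learning_rate
        self.max_iter = max_iter
        self.slope_ = 0.0
        self.intercept_ = 0.0

    def _mse(self, X: List[float], y: List[float]) -> float:
        return sum((self.slope_ * x + self.intercept_ - t) ** 2
                   for x, t in zip(X, y)) / len(X)

    def fitter(self, X: List[float], y: List[float]) -> Iterator[FitState]:
        """Run one gradient step per iteration and yield the state after each."""
        n = len(X)
        for i in range(self.max_iter):
            residuals = [self.slope_ * x + self.intercept_ - t for x, t in zip(X, y)]
            grad_slope = 2.0 / n * sum(r * x for r, x in zip(residuals, X))
            grad_intercept = 2.0 / n * sum(residuals)
            # learning_rate is read on every step, so the caller may change it mid-fit
            self.slope_ -= self.learning_rate * grad_slope
            self.intercept_ -= self.learning_rate * grad_intercept
            yield FitState(i, (self.slope_, self.intercept_), self._mse(X, y))

    def fit(self, X: List[float], y: List[float]) -> "ToyLinearRegressor":
        """Conventional fit: simply exhaust the generator."""
        for _ in self.fitter(X, y):
            pass
        return self


X = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [1.0, 3.0, 5.0, 7.0, 9.0]  # generated from y = 2x + 1

estimator = ToyLinearRegressor()
history = []
for state in estimator.fitter(X, y):
    history.append(state.error)
    if state.error < 1e-3:  # caller-side early stopping, no callback needed
        break
```

In this sketch "fit" is just a loop that exhausts "fitter", so the two entry points cannot drift apart, and breaking out of the loop leaves the estimator in whatever state the last completed iteration produced.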

Describe alternatives you've considered, if relevant

I have not come up with a way of doing this without the "fitter" generator.

Additional context

This would further help with debugging current algorithm implementations.
It could even be used to let two fitters interact and compare their behavior.

Finally

In summary, I believe this would change the way people see, understand and interact with machine learning algorithms implemented behind the sklearn interface.


glemaitre · 2020-12-29T11:34:19Z

I think that we are leaning towards callbacks to support these use cases: #16925
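For comparison, a callback-style hook might look roughly like the sketch below. This is only an illustration of the general pattern on a toy one-parameter problem. The `Callback` signature and `fit_with_callbacks` are hypothetical and are not taken from #16925 or from scikit-learn.

```python
from typing import Callable, List, Optional

# Hypothetical callback signature: receives (iteration, weight, loss) and
# returns True to request early stopping.
Callback = Callable[[int, float, float], bool]


def fit_with_callbacks(start: float, learning_rate: float, steps: int,
                       callbacks: Optional[List[Callback]] = None) -> float:
    """Minimise f(w) = (w - 3)**2, invoking every callback after each update."""
    callbacks = callbacks or []
    w = start
    for step in range(steps):
        w -= learning_rate * 2.0 * (w - 3.0)
        loss = (w - 3.0) ** 2
        # Build the full list first so every callback runs, then check for a stop request
        stop_requests = [cb(step, w, loss) for cb in callbacks]
        if any(stop_requests):
            break
    return w


losses: List[float] = []


def record_and_stop(step: int, w: float, loss: float) -> bool:
    """Record the loss and ask the loop to stop once it is small enough."""
    losses.append(loss)
    return loss < 1e-6


w_final = fit_with_callbacks(0.0, 0.25, 100, callbacks=[record_and_stop])
```

Here the training loop keeps control and decides when user code runs, so the existing "fit" signature only needs an extra optional argument.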

jamartinh · 2020-12-29T19:18:12Z

Hi @glemaitre, thanks for pointing out the callbacks initiative.
As I see it, the callback pattern was born when programming languages and tools were not powerful enough to allow more interactive control over iterative algorithms; at that time, callbacks were the envisioned remedy for allowing a little interaction with iterations or events.

It is good to have a standard API for callbacks. However, with the programming languages and tools we have these days, I think the callback paradigm can be superseded by direct interactivity.

It is not only that callbacks impose a way to interact or monitor programmatically, that is, not truly interactively; the point is interactivity itself (e.g. REPL-like debugging, tuning, monitoring).

I am afraid that generator-like iterative algorithms provide a far more powerful way to use, tune, optimize and debug than programmatic callbacks.
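As a small illustration of the REPL-style control described above, the sketch below steps a generator by hand with `next()`. The `gradient_descent` function is a hypothetical stand-in for an estimator's iterative loop, not an existing API.

```python
from typing import Iterator, Tuple


def gradient_descent(start: float, learning_rate: float,
                     steps: int) -> Iterator[Tuple[int, float, float]]:
    """Minimise f(w) = (w - 3)**2, yielding (step, w, loss) after each update."""
    w = start
    for step in range(steps):
        grad = 2.0 * (w - 3.0)
        w -= learning_rate * grad
        yield step, w, (w - 3.0) ** 2


run = gradient_descent(start=0.0, learning_rate=0.25, steps=100)

first = next(run)    # advance exactly one iteration, then inspect: (0, 1.5, 2.25)
second = next(run)   # the loop stays paused between calls: (1, 2.25, 0.5625)
run.close()          # abandon the remaining iterations explicitly
```

Between the two `next()` calls the loop is suspended, so its intermediate values can be printed or plotted from an interactive session before deciding whether to continue.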

I know, of course, that many ML libraries use the callback pattern and that it has become a de facto standard. This is because people want to interact with the algorithms, and callbacks are the legacy of more obscure times.

Why did people switch from TensorFlow to PyTorch? Because of interactivity and the dynamic nature of PyTorch, which builds on the dynamic nature of Python.
Why is Python succeeding in research? Because of its simplicity and interactivity.

In other words, callbacks as they are used today in many ML/DL/RL libraries are a legacy of the limitations of older tools that were not as dynamic and powerful as Python is today.

jnothman · 2021-01-18T01:28:59Z

Looking again at #16925, I think generators/iterators might be even trickier than callbacks in the multiprocessing context. We would be generating a series of events, with only limited assurances of order, which is much more like callbacks, in that you can't much rely on state. The obvious benefit of callbacks is that it is a much smaller change to our API.
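The ordering concern can be illustrated with a small sketch. It assumes a thread pool as a stand-in for worker processes, and `fit_one_fold` and `parallel_fitter` are hypothetical names. Events come out in completion order, so the consumer cannot assume they match submission order.

```python
import random
import time
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Iterator, Tuple


def fit_one_fold(fold: int) -> Tuple[int, float]:
    """Stand-in for fitting one sub-estimator: sleep a random time, return (fold, score)."""
    time.sleep(random.uniform(0.0, 0.05))
    return fold, 1.0 / (fold + 1)


def parallel_fitter(n_folds: int) -> Iterator[Tuple[int, float]]:
    """Yield one event per finished fold, in completion order."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        futures = [pool.submit(fit_one_fold, fold) for fold in range(n_folds)]
        for future in as_completed(futures):
            yield future.result()


events = list(parallel_fitter(8))
folds_seen = [fold for fold, _ in events]  # typically not 0, 1, 2, ... in order
```

Every fold is reported exactly once, but the sequence in `folds_seen` can differ from run to run, which is why a consumer of such a generator cannot rely much on shared state between events.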

jamartinh added the New Feature label Dec 28, 2020

cmarmo added the API label Jan 18, 2021
