ETA On Compare Models #2428
Comments
I think this is not possible right now, since there is no way to query the underlying models for the amount of time remaining (a limitation of scikit-learn). @Yard1 - any comments on this request?
Found something here
There is a package called scitime, as mentioned here
This seems to be quite an invasive method to compute run time. You are essentially building models around the model that you really want to build. Also, not sure this applies to all sklearn functionality such as Would prefer a method that is more tightly coupled with sklearn functionality. Also, would like to hear the thoughts of the other core developers as well.
Understandable. Let's keep this feature request open for now so that someone else can come up with an idea for how to implement it.
Yeah, none of the available solutions would work here (or indeed, for all of our models).
Is it possible to estimate the ETA once training has started and some time has elapsed? For example, if 10% completed in 60 seconds, could the total time to reach 100% be estimated at around 600 seconds, more or less?
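For illustration, the linear extrapolation suggested in this comment can be sketched in a few lines of Python. This is only a sketch under the assumption that progress is linear and that a completed fraction is available at all; the function name `estimate_remaining_seconds` is hypothetical and is not part of PyCaret or scikit-learn. With 10% done after 60 seconds, it gives a projected total of about 600 seconds, of which about 540 remain.

```python
def estimate_remaining_seconds(elapsed_seconds: float, fraction_done: float) -> float:
    """Linearly extrapolate the remaining time from the elapsed time.

    Assumption: work proceeds at a constant rate, so the projected total
    is elapsed / fraction_done. Hypothetical helper, not a library API.
    """
    if not 0.0 < fraction_done <= 1.0:
        raise ValueError("fraction_done must be in the interval (0, 1]")
    projected_total = elapsed_seconds / fraction_done
    return projected_total - elapsed_seconds


# The example from the comment: 10% completed after 60 seconds.
remaining = estimate_remaining_seconds(60.0, 0.10)
print(f"about {remaining:.0f} seconds remaining")
```

The estimate is only as good as the constant-rate assumption and the availability of a meaningful `fraction_done` value.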
I'll put this on the sklearn repo as well and let's see what they have to say about it. Sharing on Stack Overflow as well, so the computer science community can suggest something.
This is not true since training is not a linear process. Moreover, what do you mean by training is 10% complete? This may work in case of Deep Neural Networks where you have epochs, but that is not the case for machine learning models (for the most part).
This may be the best path forward. This should be handled in the sklearn repo itself rather than as a wrapper around the models outside sklearn. That would be the most sustainable way to do this (if it is even possible).
Reply from scikit-learn:
Is your feature request related to a problem? Please describe.
When a dataset is large or has many features, the compare_models() function often takes a long time, sometimes hours on a regular laptop with 4 GB RAM, SSD storage, and no GPU. It is often unclear how long it will take, which makes planning difficult and can leave one feeling stuck.
Describe the solution you'd like
If we can have an ETA (estimated time remaining) on compare_models(), it will be very helpful.
Describe alternatives you've considered
Additional context