Suggestion: Add support for unpenalized logistic regression #6738
`LinearRegression` provides unpenalized OLS, and `SGDClassifier`, which supports `loss="log"`, also supports `penalty="none"`. But if you want plain old unpenalized logistic regression, you have to fake it by setting `C` in `LogisticRegression` to a large number, or use `Logit` from statsmodels instead.

Comments
What's the problem with that approach?
I assumed that it's inexact and slower than a direct implementation of unpenalized logistic regression. Am I wrong? I notice that setting `C` to a huge value makes fitting very slow:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

x = np.array([0, 0, 0, 0, 1, 1, 1, 1]).reshape(-1, 1)
y = [1, 0, 0, 0, 1, 1, 1, 0]
m = LogisticRegression(C=1e200)
m.fit(x, y)
print(m.intercept_, m.coef_)
```
Yes, this is to be expected, as the problem becomes ill-posed when `C` is large. Iterative solvers are slow with ill-posed problems. In your example, the algorithm takes forever to reach the desired tolerance. You either need to increase the tolerance (`tol`) or the iteration limit (`max_iter`).
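For concreteness, a minimal sketch of those two workarounds (the specific values are arbitrary illustrations, not recommendations):

```python
from sklearn.linear_model import LogisticRegression

# Give the solver more iterations to reach the tolerance ...
m = LogisticRegression(C=1e200, max_iter=10000)

# ... or loosen the tolerance so it stops earlier.
m = LogisticRegression(C=1e200, tol=1e-2)
```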
@mblondel is there an alternative to "iterative solvers"? @Kodiologist why do you want this?
You're asking why would I want to do logistic regression without regularization? Because (1) sometimes the sample is large enough in proportion to the number of features that regularization won't buy one anything and (2) sometimes the best-fitting coefficients are of interest, as opposed to maximizing predictive accuracy.
Yes, that was my question. (1) is not true. It will always buy you a faster solver. (2) is more in the realms of statistical analysis, which is not really the focus of scikit-learn. I guess we could add this but I don't know what solver we would use. As a non-statistician, I wonder what good any coefficients are that change with a bit of regularization.
I can't say much about (1) since computation isn't my forte. For (2), I am a data analyst with a background in statistics. I know that scikit-learn focuses on traditional machine learning, but it is in my opinion the best Python package for data analysis right now, and I think it will benefit from not limiting itself too much. (I also think, following Larry Wasserman and Andrew Gelman, that statistics and machine learning would mutually benefit from intermingling more, but I guess that's its own can of worms.) All coefficients will change with regularization; that's what regularization does.
I'm not opposed to adding a solver without regularization. We can check what would be good, or just bail and use l-bfgs and check beforehand if it's ill-conditioned? Yes, all coefficients change with regularization. I'm just honestly curious what you want to do with them afterwards.
Hey, … Or statsmodels?
What solvers do you suggest to implement? How would that be different from the solvers we already have with C -> infty?
You could try looking at R or statsmodels for ideas. I'm not familiar with their methods, but they're reasonably fast and use no regularization at all.
Yeah, statsmodels does the job too if you use the QR algorithm for matrix inversion. My use case is around model interpretability. For performance, I would definitely use regularization.
I don't think we need to add any new solver... Logistic regression doesn't enjoy a closed-form solution, which means that statsmodels must use an iterative solver of some kind too (my guess would be iteratively reweighted least squares, but I haven't checked). Setting `C` to a very large value should give you an effectively unpenalized model.
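Out of curiosity, a toy sketch of what IRLS (Newton's method) looks like for unpenalized logistic regression; this is my own illustration, not statsmodels' actual code, `logreg_irls` is a made-up helper, and `X` is assumed to already include an intercept column:

```python
import numpy as np

def logreg_irls(X, y, n_iter=25, tol=1e-8):
    # Unpenalized logistic regression via iteratively reweighted least squares.
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ beta)))  # predicted probabilities
        W = p * (1.0 - p)                      # IRLS weights (Hessian diagonal)
        # Newton step: solve (X' W X) delta = X' (y - p)
        delta = np.linalg.solve(X.T @ (W[:, None] * X), X.T @ (y - p))
        beta += delta
        if np.max(np.abs(delta)) < tol:
            break
    return beta
```

Note that the linear solve blows up exactly when the data are separable or collinear, which is the same ill-posedness discussed above.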
we're changing the default solver to lbfgs in #10001 fwiw
For the folks who really want unregularised logistic regression (like myself): I've had to settle for using statsmodels and writing a wrapper class that mimics the SKLearn API.
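For anyone in the same boat, a rough sketch of what such a wrapper might look like (`UnpenalizedLogisticRegression` is a made-up name, and only `fit`/`predict`/`predict_proba` are mimicked here):

```python
import numpy as np
import statsmodels.api as sm

class UnpenalizedLogisticRegression:
    def fit(self, X, y):
        # statsmodels does not add an intercept automatically
        self.result_ = sm.Logit(y, sm.add_constant(X)).fit(disp=0)
        return self

    def predict_proba(self, X):
        p = self.result_.predict(sm.add_constant(X))
        return np.column_stack([1 - p, p])

    def predict(self, X):
        return (self.result_.predict(sm.add_constant(X)) >= 0.5).astype(int)
```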
Any updates on this? This is a big blocker for my willingness to recommend scikit-learn to people. It's also not at all obvious to people coming from other libraries that scikit-learn does regularization by default and that there's no way to disable it. |
@shermstats Any suggestions on how to improve the documentation on that? I agree that it might not be very obvious.
You can specify `C=np.inf` (or just a very large `C`), and the fit comes out essentially the same as statsmodels':

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import log_loss
import statsmodels.api as sm

X, y = make_classification(random_state=2)

lr = LogisticRegression(C=np.inf, solver='lbfgs').fit(X, y)
logit = sm.Logit(y, X)
res = logit.fit()

print(log_loss(y, lr.predict_proba(X)))
print(log_loss(y, res.predict(X)))
```
So I would argue we should just document that you can get an unpenalized model by setting `C` large or to `np.inf`.
I'd suggest adding this to the docstring and the user guide.
R's …
Why not add/allow `penalty='none'`?
@Kodiologist I'm not opposed to adding `penalty='none'`.
If you feel it adds to discoverability then we can add it, and 3 is a valid point (though we probably can't really change that without deprecations; see the current change of the solver).
I don't have the round tuits for it; sorry.
@Kodiologist at least you taught me an idiom I didn't know about ;)
So, open for contributors: add `penalty='none'`.
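Assuming the option lands as suggested, usage would presumably look something like this (hypothetical until it's actually merged):

```python
from sklearn.linear_model import LogisticRegression

# Proposed: no regularization at all, roughly equivalent to C=np.inf
clf = LogisticRegression(penalty='none', solver='lbfgs')
```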
This sounds reasonable to me. I'd also suggest bolding the first sentence because it's legitimately that surprising for people coming from other machine learning or data analysis environments.
@shermstats So @Kodiologist suggested adding `penalty='none'`.
Exactly! I have a stats background and have worked with many statistics people coming from R or even point-and-click interfaces, and this behavior is very surprising to us. I think for now that `penalty='none'` is a reasonable solution.
Sorry, which issue do you mean? We're switching to l-bfgs by default, and we can also internally switch the solver to l-bfgs automatically if someone specifies `penalty='none'`.
This issue, which refers to the iterative algorithm becoming very slow for large C. I'm not a numerical analysis expert, but if l-bfgs prevents it from slowing down then that sounds like the right solution.
@shermstats yes, with l-bfgs this doesn't seem to be an issue. I haven't run extensive benchmarks, though, and won't have time to. If anyone wants to run benchmarks, that would be a great help.
If `penalty='none'` is to be included, I suggest adding to the user guide the same warning about collinear X as for OLS (particularly for one-hot encoded features).
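To illustrate the point for one-hot encoded features: encoding every level of a categorical variable makes the columns sum to one, which is perfectly collinear with the intercept, so the unpenalized solution is not unique. A small self-contained example, not scikit-learn code:

```python
import numpy as np

# One-hot encoding of a 3-level categorical feature, all levels kept:
X = np.array([[1, 0, 0],
              [0, 1, 0],
              [0, 0, 1],
              [0, 1, 0]])

# Every row sums to 1, so the columns of X are perfectly collinear
# with the intercept column that LogisticRegression fits by default.
design = np.column_stack([np.ones(len(X)), X])
print(np.linalg.matrix_rank(design))  # 3, not 4: rank-deficient
```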