
Suggestion: Add support for unpenalized logistic regression #6738

Closed
Kodiologist opened this issue Apr 30, 2016 · 34 comments · Fixed by #12860
Labels
Documentation, Easy (Well-defined and straightforward way to resolve)

Comments

@Kodiologist

Kodiologist commented Apr 30, 2016

LinearRegression provides unpenalized OLS, and SGDClassifier, which supports loss="log", also supports penalty="none". But if you want plain old unpenalized logistic regression, you have to fake it by setting C in LogisticRegression to a large number, or use Logit from statsmodels instead.
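
For concreteness, the two workarounds look roughly like this (a hedged sketch with made-up toy data; statsmodels' Logit does not add an intercept automatically, hence add_constant):

import numpy as np
import statsmodels.api as sm
from sklearn.linear_model import LogisticRegression

# Toy, non-separable data (made up for illustration)
x = np.array([[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]])
y = np.array([0, 0, 1, 0, 1, 1])

# Workaround 1: approximate "no penalty" with a very large C
m = LogisticRegression(C=1e9).fit(x, y)
print(m.intercept_, m.coef_)

# Workaround 2: statsmodels fits a genuinely unpenalized model,
# but the intercept column has to be added explicitly
res = sm.Logit(y, sm.add_constant(x)).fit(disp=0)
print(res.params)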

@Kodiologist changed the title from "Suggestion: Add support for unpenalized linear regression" to "Suggestion: Add support for unpenalized logistic regression" on Apr 30, 2016
@mblondel
Member

mblondel commented May 2, 2016

you have to fake it by setting C in LogisticRegression to a large number

What's the problem with that approach?

@Kodiologist
Author

I assumed that it's inexact and slower than a direct implementation of unpenalized logistic regression. Am I wrong?

I notice that setting C too high, as in the following, will cause LogisticRegression.fit to hang. But I don't know if this is a bug or just an inherent property of the algorithm and its implementation on a 64-bit computer.

import numpy as np
from sklearn.linear_model import LogisticRegression

# A single binary feature; the classes are not linearly separable
x = np.array([[0, 0, 0, 0, 1, 1, 1, 1]]).T
y =           [1, 0, 0, 0, 1, 1, 1, 0]

m = LogisticRegression(C=1e200)
m.fit(x, y)
print(m.intercept_, m.coef_)

@mblondel
Member

mblondel commented May 2, 2016

I notice that setting C too high, as in the following, will cause LogisticRegression.fit to hang

Yes, this is to be expected: the problem becomes ill-posed when C is large, and iterative solvers are slow on ill-posed problems.

In your example, the algorithm takes forever to reach the desired tolerance. You either need to increase tol or hardcode max_iter.
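
For the snippet above, either of these keeps the fit from running away (the values are illustrative, not recommendations):

from sklearn.linear_model import LogisticRegression

# Relax the convergence tolerance ...
m1 = LogisticRegression(C=1e200, tol=1e-2)

# ... or cap the number of iterations explicitly
m2 = LogisticRegression(C=1e200, max_iter=50)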

@amueller
Member

@mblondel is there an alternative to "iterative solvers"?
You won't get exactly the unregularized option, right?

@Kodiologist why do you want this?

@Kodiologist
Author

You're asking why I would want to do logistic regression without regularization? Because (1) sometimes the sample is large enough relative to the number of features that regularization won't buy one anything, and (2) sometimes the best-fitting coefficients are of interest, as opposed to maximizing predictive accuracy.

@amueller
Member

Yes, that was my question.

(1) is not true. It will always buy you a faster solver.

(2) is more in the realms of statistical analysis, which is not really the focus of scikit-learn. I guess we could add this but I don't know what solver we would use. As a non-statistician, I wonder what good any coefficients are that change with a bit of regularization.

@Kodiologist
Author

Kodiologist commented Oct 11, 2016

I can't say much about (1) since computation isn't my forte. For (2), I am a data analyst with a background in statistics. I know that scikit-learn focuses on traditional machine learning, but it is in my opinion the best Python package for data analysis right now, and I think it will benefit from not limiting itself too much. (I also think, following Larry Wasserman and Andrew Gelman, that statistics and machine learning would mutually benefit from intermingling more, but I guess that's its own can of worms.) All coefficients will change with regularization; that's what regularization does.

@amueller
Member

I'm not opposed to adding a solver without regularization. We can check what would be good, or just bail and use l-bfgs and check beforehand whether it's ill-conditioned?

Yes, all coefficients change with regularization. I'm just honestly curious what you want to do with them afterwards.

@alexcombessie

Hey,
What is the status on this topic? I'd be really interested in an unpenalized Logistic Regression. This way p-values will mean something statistically speaking. Otherwise I will have to continue using R 😢 for such use cases...
Thanks,
Alex
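
For reference, this is the kind of output being asked for; a hedged sketch using statsmodels on simulated data (with a penalized fit, these standard errors and p-values would no longer be meaningful):

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = (X @ [1.0, -0.5, 0.0] + rng.normal(size=200) > 0).astype(int)

# Unpenalized fit: the reported standard errors and p-values
# are the usual maximum-likelihood ones
res = sm.Logit(y, sm.add_constant(X)).fit(disp=0)
print(res.summary())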

@jnothman
Member

jnothman commented Feb 10, 2018 via email

@mblondel
Member

mblondel commented Feb 12, 2018

What solvers do you suggest to implement? How would that be different from the solvers we already have with C -> infty ?

@Kodiologist
Author

What solvers do you suggest to implement? How would that be different from the solvers we already have with C -> infty ?

You could try looking at R or statsmodels for ideas. I'm not familiar with their methods, but they're reasonably fast and use no regularization at all.

@alexcombessie

Yeah statsmodels does the job too if you use the QR algorithm for matrix inversion. My use case is around model interpretability. For performance, I would definitely use regularization.

@mblondel
Member

I don't think we need to add any new solver... Logistic regression doesn't enjoy a closed-form solution, which means that statsmodels must use an iterative solver of some kind too (my guess would be iteratively reweighted least squares, but I haven't checked). Setting C=np.inf (or equivalently alpha=0) should in principle work with our current solvers. My recommendation would be to switch to the L-BFGS or Newton-CG solver, since liblinear can indeed be very slow in this setting. Perhaps we can add a solver="auto" option and automatically switch to one of these when C=np.inf or equivalently penalty="none"?

@jnothman
Member

jnothman commented Feb 14, 2018 via email

@arose13

arose13 commented Apr 20, 2018

For the folks who really want unregularised logistic regression (like myself): I've been having to settle for statsmodels, with a wrapper class that mimics the scikit-learn API.
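
A minimal sketch of such a wrapper (hypothetical, not arose13's actual code) might look like:

import numpy as np
import statsmodels.api as sm
from sklearn.base import BaseEstimator, ClassifierMixin

class UnpenalizedLogisticRegression(BaseEstimator, ClassifierMixin):
    """Thin scikit-learn-style wrapper around statsmodels' Logit."""

    def fit(self, X, y):
        # statsmodels does not add an intercept automatically
        self.result_ = sm.Logit(y, sm.add_constant(X)).fit(disp=0)
        self.classes_ = np.unique(y)
        return self

    def predict_proba(self, X):
        p = self.result_.predict(sm.add_constant(X, has_constant='add'))
        return np.column_stack([1 - p, p])

    def predict(self, X):
        return (self.predict_proba(X)[:, 1] >= 0.5).astype(int)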

@shermstats

Any updates on this? This is a big blocker for my willingness to recommend scikit-learn to people. It's also not at all obvious to people coming from other libraries that scikit-learn does regularization by default and that there's no way to disable it.

@amueller
Member

@shermstats any suggestions for how to improve the documentation on that? I agree that it might not be very obvious.
Does l-bfgs allow C=np.inf?

@amueller
Member

amueller commented Nov 12, 2018

You can specify C=np.inf, though it'll give you the same result as setting C to a large value. On the example I tried, it gave a better fit than statsmodels, and statsmodels failed to converge with most other random seeds:

import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import log_loss
import statsmodels.api as sm

X, y = make_classification(random_state=2)
lr = LogisticRegression(C=np.inf, solver='lbfgs').fit(X, y)

logit = sm.Logit(y, X)
res = logit.fit()
# Optimization terminated successfully.
#          Current function value: 0.167162
#          Iterations 10

print(log_loss(y, lr.predict_proba(X)))  # 0.16197793224715606
print(log_loss(y, res.predict(X)))       # 0.16716164149746823

So I would argue we should just document that you can get an unpenalized model by setting C large or to np.inf.

@amueller amueller added Easy Well-defined and straightforward way to resolve Documentation good first issue Easy with clear instructions to resolve help wanted labels Nov 12, 2018
@amueller
Member

amueller commented Nov 12, 2018

I'd suggest adding to the docstring and the user guide:
"The LogisticRegression model is penalized by default. You can obtain an unpenalized model by setting C=np.inf and solver='lbfgs'."

@Kodiologist
Author

Kodiologist commented Nov 12, 2018

it gave a better fit than statsmodel and statsmodel failed to converge with most other random seeds

R's glm is more mature and may make for a better comparison.

I'd suggest adding to the docstring and the user guide:
"The LogisticRegression model is penalized by default. You can obtain an unpenalized model by setting C=np.inf and solver='lbfgs'."

Why not allow penalty="none" à la SGDClassifier?

@amueller
Member

@Kodiologist I'm not opposed to adding penalty="none" but I'm not sure what the benefit is to adding a redundant option.
And I think we'd welcome comparisons to glm. I'm not very familiar with glm so I'm probably not a good person to perform the comparison. However, we are optimizing the log-loss so there should really be no difference. Maybe they implement different solvers so having a benchmark would be nice.

@Kodiologist
Author

Kodiologist commented Nov 12, 2018

I'm not opposed to adding penalty="none" but I'm not sure what the benefit is to adding a redundant option.

  1. It becomes clearer how to get an unpenalized model.
  2. It becomes clearer to the reader what code that's using an unpenalized model is trying to do.
  3. It allows sklearn to change its implementation of unregularized models in the future without breaking people's code.

@amueller
Member

If you feel it adds to discoverability then we can add it, and 3 is a valid point (though in practice we probably can't change that without deprecations anyway; see the current change of the default solver).
Do you want to send a PR?

@Kodiologist
Author

I don't have the round tuits for it; sorry.

@amueller
Member

@Kodiologist at least you taught me an idiom I didn't know about ;)

@amueller amueller removed the good first issue Easy with clear instructions to resolve label Nov 13, 2018
@amueller
Member

So this is open for contributors: add penalty='none' as an option. Also, possibly check which solvers support this / are efficient with it (liblinear probably is not) and restrict to those solvers.
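
If that lands, user-facing code would presumably read something like this (assuming the proposed spelling penalty='none', with the solver restricted to one that copes with the unregularized problem, e.g. lbfgs):

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(random_state=0)

# Proposed: an explicit "no regularization" option instead of a huge C
clf = LogisticRegression(penalty='none', solver='lbfgs').fit(X, y)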

@shermstats

I'd suggest adding to the docstring and the user guide:
"The LogisticRegression model is penalized by default. You can obtain an unpenalized model by setting C=np.inf and solver='lbfgs'."

This sounds reasonable to me. I'd also suggest bolding the first sentence because it's legitimately that surprising for people coming from other machine learning or data analysis environments.

@amueller
Member

@shermstats So @Kodiologist suggested adding penalty="none" to make it more explicit, which would just be an alias for C=np.inf. It makes sense to me to make this more explicit in this way. Do you have thoughts on that?
Then that would be what's in the documentation. And I agree that bold might be a good idea.
I think for someone with an ML background this is (maybe?) expected; for someone with a stats background, it seems very surprising.

@shermstats

Exactly! I have a stats background and have worked with many statistics people coming from R or even point-and-click interfaces, and this behavior is very surprising to us. I think for now that penalty=None (not sure about "none" vs. None) is a good solution. In the future, we should have a separate solver that's called automatically for unpenalized logistic regression to prevent the issues that @mblondel described.

@amueller
Member

Sorry, which issue do you mean? We're switching to l-bfgs by default, and we can also internally switch the solver to l-bfgs automatically if someone specifies penalty='none' (often None is a special token we use for deprecated parameters, but we have stopped doing that; still, 'none' would be more consistent with the rest of the library).
We need solver="auto" anyway so changing the solver based on the penalty shouldn't be an issue.

@shermstats

This issue, which refers to the iterative algorithm becoming very slow for large C. I'm not a numerical analysis expert, but if l-bfgs prevents it from slowing down then that sounds like the right solution. penalty='none' also sounds like the right way to handle this.

@amueller
Member

@shermstats yes, with l-bfgs this doesn't seem to be an issue. I haven't run extensive benchmarks, though, and won't have time to. If anyone wants to run benchmarks, that would be a great help.

@lorentzenchr
Member

If penalty='none' is to be included, I suggest adding to the user guide the same warning about collinear X as for OLS (in particular for one-hot encoded features).
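
To illustrate the concern: one-hot encoding every level of a categorical feature, together with a fitted intercept, yields a rank-deficient (collinear) design, and the unpenalized coefficients are then not uniquely determined. Dropping one reference level is the usual remedy; a small sketch:

import numpy as np
from sklearn.preprocessing import OneHotEncoder

colors = np.array([['red'], ['green'], ['blue'], ['green']])

# All levels encoded: the columns sum to 1, so together with an
# intercept the design matrix is collinear (rank deficient)
X_full = OneHotEncoder().fit_transform(colors).toarray()

# Dropping one reference level removes the collinearity
X_drop = OneHotEncoder(drop='first').fit_transform(colors).toarray()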
