
Update readme comparison chart #113

Open · jasonlaska opened this issue Sep 9, 2018 · 1 comment

@jasonlaska (Member)

We might want to improve our documentation of the differences between us and scikit-learn. Support for the randomized lasso has been removed from scikit-learn: the maintainers consider it too unreliable, and they argue that rescaling the design matrix is equivalent to putting adaptive penalties in the regularizer. But these are not equivalent operations; I suspect the difference is related to sparsity vs. co-sparsity. This gives our implementation an advantage.

@mnarayan, can you provide the changes you'd like and I'll update the chart.

@mnarayan (Member) commented Feb 17, 2019

There are three major differences now:

1. We should highlight that we support a callable empirical covariance estimator; scikit-learn does not support this. See the sketch below.
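
As an illustration for the README, here is a minimal sketch, assuming skggm's `init_method` parameter accepts a callable that maps the data matrix to a `(covariance, lam_scale)` pair; the estimator name (`QuicGraphicalLasso` vs. the older `QuicGraphLasso`) and the exact return convention may differ across skggm versions:

```python
import numpy as np
from scipy.stats import spearmanr
from inverse_covariance import QuicGraphicalLasso  # skggm; QuicGraphLasso in older versions

def spearman_covariance(X):
    """Custom empirical 'covariance': a rank-based correlation estimate.

    Assumption: skggm calls this with the data matrix and expects a
    (matrix, lam_scale) pair back; check your installed version.
    """
    rho, _ = spearmanr(X)
    return np.atleast_2d(rho), 1.0

X = np.random.RandomState(0).randn(100, 5)
model = QuicGraphicalLasso(lam=0.5, init_method=spearman_covariance)
model.fit(X)
print(model.precision_)
```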

2. scikit-learn discourages the use of random weights for sparse support recovery, a feature previously provided by its randomized_l1 linear model module.

The randomized_l1 module was deprecated in scikit-learn/scikit-learn#8995:

@deprecated("The class RandomizedLasso is deprecated in 0.19"
            " and will be removed in 0.21.")
class RandomizedLasso(BaseRandomizedLinearModel):
    """Randomized Lasso.

@deprecated("The function lasso_stability_path is deprecated in 0.19"
            " and will be removed in 0.21.")
def lasso_stability_path(X, y, scaling=0.5, random_state=None,
                         n_resampling=200, n_grid=100,
                         sample_fraction=0.75,

See the previous implementation here.

Thus the README chart should no longer say that "random lasso is available for the regular lasso." I'm not sure anything else needs to change.

The random lasso by re-weighting predictors is now available only through an auxiliary package: https://github.com/scikit-learn-contrib/stability-selection/blob/master/stability_selection/randomized_lasso.py. The recipe is easy to sketch, as shown below.
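
For reference, the core idea can be written down with plain scikit-learn: subsample the rows, draw random per-feature weights in [weakness, 1], rescale the columns of X, fit an ordinary lasso, and count how often each feature is selected. This is an illustrative sketch of the idea, not the stability-selection package's API:

```python
import numpy as np
from sklearn.linear_model import Lasso

def randomized_lasso_frequencies(X, y, alpha=0.1, weakness=0.5,
                                 sample_fraction=0.75,
                                 n_resampling=200, seed=0):
    """Selection frequencies from a randomized lasso (illustrative sketch).

    Each round subsamples the data and down-weights predictor j by a
    random factor in [weakness, 1]; features that survive many random
    re-weightings are considered stable.
    """
    rng = np.random.RandomState(seed)
    n_samples, n_features = X.shape
    n_sub = int(sample_fraction * n_samples)
    counts = np.zeros(n_features)
    for _ in range(n_resampling):
        idx = rng.choice(n_samples, n_sub, replace=False)
        # Random per-column weights: the "randomization" of the lasso.
        w = rng.uniform(weakness, 1.0, size=n_features)
        lasso = Lasso(alpha=alpha).fit(X[idx] * w, y[idx])
        counts += lasso.coef_ != 0
    return counts / n_resampling

rng = np.random.RandomState(42)
X = rng.randn(200, 10)
y = X[:, 0] - 2 * X[:, 1] + 0.5 * rng.randn(200)
print(randomized_lasso_frequencies(X, y))  # features 0 and 1 near 1.0
```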

3. They only permit weighted/adaptive regularization by applying the weights to the columns of the predictor matrix X. This is not always equivalent to replacing the scalar in the l1 term with a vector or matrix of penalties.

From scikit-learn/scikit-learn#6093 (comment):

"The case of specifying weights seems quite unusual. Can you show how that is not equivalent to changing the scaling of the features? The citation proves recovery with the specific initialization scheme, but without that, it seems to me that this is exactly the same as rescaling the data."

But we know from Michael Elad's paper that these are not equivalent problems: the former is weighted regularization in the analysis space, while the latter is weighted regularization in the synthesis space.

Elad, Michael, Peyman Milanfar, and Ron Rubinstein. "Analysis versus synthesis in signal priors." Inverse Problems 23.3 (2007): 947-968.

The analysis version also goes by the name of the generalized lasso:

Tibshirani, Ryan J., and Jonathan Taylor. "The solution path of the generalized lasso." The Annals of Statistics 39.3 (2011): 1335-1371.
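
To make the distinction concrete, here is a sketch of the two formulations (notation mine, following the references above):

```latex
% Synthesis (weighted lasso): the weights act on the coefficients.
% Substituting \gamma_j = w_j \beta_j shows this IS equivalent to
% rescaling the columns of X, as the scikit-learn comment claims.
\min_{\beta}\; \tfrac{1}{2}\|y - X\beta\|_2^2 + \lambda \sum_j w_j |\beta_j|

% Analysis (generalized lasso): the penalty acts through an operator D.
% Only a diagonal D can be absorbed into a column rescaling of X; for a
% general D (e.g., a difference operator) no rescaling reproduces it.
\min_{\beta}\; \tfrac{1}{2}\|y - X\beta\|_2^2 + \lambda \|D\beta\|_1
```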

Our README is still correct that their graphical_lasso does not support adaptivity, unless one hands the algorithm re-weighted columns of the data matrix X. The latter can mimic adaptivity for nodewise/neighborhood selection (i.e., lasso of one node against all other variables), but it is not equivalent to the weighted graphical lasso formulation, as the calculation below shows.
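
A short calculation (my notation) shows exactly how much adaptivity column re-weighting can buy here. Rescale the data as X -> XD with D = diag(d), so the sample covariance becomes S' = DSD, and substitute Theta' = D^{-1} Theta D^{-1} in the standard graphical lasso:

```latex
% Graphical lasso on the rescaled sample covariance S' = D S D:
\min_{\Theta' \succ 0}\; -\log\det\Theta' + \operatorname{tr}(D S D\,\Theta')
  + \lambda \sum_{ij} |\Theta'_{ij}|

% With \Theta' = D^{-1}\Theta D^{-1}, and up to the additive constant
% 2\sum_i \log d_i, this is a weighted graphical lasso whose penalty
% matrix has the rank-one form \Lambda_{ij} = \lambda/(d_i d_j):
\min_{\Theta \succ 0}\; -\log\det\Theta + \operatorname{tr}(S\Theta)
  + \sum_{ij} \frac{\lambda}{d_i d_j}\,|\Theta_{ij}|
```

So re-weighting the columns only reaches penalty matrices of this rank-one multiplicative form; an arbitrary penalty matrix (say, one penalizing a single edge more heavily) is out of reach, which is exactly the adaptivity the weighted graphical lasso formulation provides.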
