Add support for dropping collinear variables #16376

Closed
divyaprabha123 opened this issue Feb 3, 2020 · 3 comments
Comments

@divyaprabha123
Contributor

Describe the workflow you want to enable

Can we add a feature to LinearRegression that removes collinearity (exact collinearity) from the data?

Describe your proposed solution

My proposal is to add an extra argument such as remove_collinearity. If the user sets it, we can remove exactly collinear variables using the rank of the matrix, or approximately collinear variables using the variance inflation factor (VIF). This can save some time compared with switching to Ridge regression.
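
A minimal sketch of the two checks mentioned above, assuming plain NumPy. The helper names `independent_columns` and `vif` are hypothetical and not part of scikit-learn. The first keeps a column only if it raises the rank of the columns kept so far. The second uses the fact that the VIF values are the diagonal of the inverse correlation matrix, so it only works once exact collinearity has been removed.

```python
import numpy as np


def independent_columns(X, tol=None):
    """Return indices of columns that are not exact linear combinations
    of previously kept columns (hypothetical helper, greedy rank test)."""
    X = np.asarray(X, dtype=float)
    kept = []
    for j in range(X.shape[1]):
        candidate = X[:, kept + [j]]
        # Keep column j only if it increases the rank of the selection.
        if np.linalg.matrix_rank(candidate, tol=tol) == len(kept) + 1:
            kept.append(j)
    return kept


def vif(X):
    """Variance inflation factor per column (hypothetical helper).

    Assumes no constant columns and no exact collinearity; otherwise the
    correlation matrix is undefined or singular.
    """
    corr = np.corrcoef(np.asarray(X, dtype=float), rowvar=False)
    return np.diag(np.linalg.inv(corr))


rng = np.random.RandomState(0)
a, b, c = rng.randn(3, 50)
X = np.column_stack([a, b, a + b, c])  # third column is exactly collinear

kept = independent_columns(X)
print(kept)
print(vif(X[:, kept]))
```

The greedy rank test costs one rank computation per column, so it is only meant to illustrate the idea. A pivoted QR decomposition would be a more efficient route to the same selection.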

@rth
Member

rth commented Feb 3, 2020

It might be better to have this as a preprocessor in sklearn.feature_selection, so that it could be applied to multiple estimators. I'm not sure that exact collinearity is a frequent issue though. Maybe an estimator with a user-defined feature correlation threshold?
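
A rough sketch of what such a transformer could look like. The class name `CorrelationThreshold`, its `threshold` parameter, and the `support_` attribute are assumptions made for illustration, not an existing scikit-learn API. It greedily keeps a feature only if its absolute Pearson correlation with every feature kept so far stays below the threshold.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.utils import check_array


class CorrelationThreshold(BaseEstimator, TransformerMixin):
    """Hypothetical selector that drops features too correlated with
    an earlier kept feature. Assumes no constant columns."""

    def __init__(self, threshold=0.95):
        self.threshold = threshold

    def fit(self, X, y=None):
        X = check_array(X)
        corr = np.abs(np.corrcoef(X, rowvar=False))
        kept = []
        for j in range(X.shape[1]):
            # Keep feature j if it is not too correlated with any kept one.
            if all(corr[j, k] < self.threshold for k in kept):
                kept.append(j)
        self.support_ = np.zeros(X.shape[1], dtype=bool)
        self.support_[kept] = True
        return self

    def transform(self, X):
        return check_array(X)[:, self.support_]


rng = np.random.RandomState(0)
a, b = rng.randn(2, 200)
X = np.column_stack([a, 2 * a + 0.01 * rng.randn(200), b])

selector = CorrelationThreshold(threshold=0.95).fit(X)
print(selector.support_)
print(selector.transform(X).shape)
```

Because it follows the fit/transform convention, a transformer like this could be placed in a Pipeline ahead of LinearRegression or any other estimator, which is the reuse argument made above.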

I'm not sure if it's something that is often done, as opposed to, say, feature clustering. The latter can be done in scikit-learn with cluster.FeatureAgglomeration, though the interface with a required n_clusters may not be ideal.
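
For reference, a small example of the feature clustering route with `FeatureAgglomeration`, using synthetic data made up for illustration. Two nearly identical features are merged into one cluster and replaced by their mean (the default pooling function), while the number of clusters has to be chosen up front.

```python
import numpy as np
from sklearn.cluster import FeatureAgglomeration

rng = np.random.RandomState(0)
a, b = rng.randn(2, 100)
# Features 0 and 1 are nearly identical; feature 2 is unrelated.
X = np.column_stack([a, a + 0.01 * rng.randn(100), b])

agglo = FeatureAgglomeration(n_clusters=2)
X_reduced = agglo.fit_transform(X)

print(agglo.labels_)    # cluster index assigned to each original feature
print(X_reduced.shape)  # one output column per cluster
```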

cc @glemaitre

@thomasjpfan
Member

This is being worked on as a feature selection transformer here: #14698

@rth
Member

rth commented Feb 3, 2020

Indeed, thanks. Closing this issue as a duplicate of #13405 then. If you have other comments or suggestions, @divyaprabha123, please comment there.

@rth closed this as completed Feb 3, 2020