Implement support for sparse feature data #26

Open
ogrisel opened this issue Nov 1, 2018 · 0 comments

Comments

Owner

ogrisel commented Nov 1, 2018

For instance, when all the data is passed as a scipy.sparse.csc_matrix (e.g. after one-hot encoding).
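To make the use case concrete, here is a minimal sketch of what such an input could look like. The `codes` array and the `column_nonzero_rows` helper are made up for illustration; only the `scipy.sparse.csc_matrix` constructor and its `indptr` / `indices` attributes come from SciPy.

```python
import numpy as np
from scipy import sparse

# Hypothetical categorical feature, already mapped to integer codes.
codes = np.array([0, 2, 1, 2, 3, 0])
n_samples = codes.shape[0]
n_categories = int(codes.max()) + 1

# One-hot encode: exactly one non-zero per row, in column codes[i].
data = np.ones(n_samples, dtype=np.float32)
rows = np.arange(n_samples)
X = sparse.csc_matrix((data, (rows, codes)), shape=(n_samples, n_categories))


def column_nonzero_rows(X, j):
    """Row indices of the non-zero entries of feature j (illustrative helper).

    CSC stores each column contiguously, so this only touches the
    entries of the requested column.
    """
    start, end = X.indptr[j], X.indptr[j + 1]
    return X.indices[start:end]
```

Per-feature access is cheap in this layout, which is presumably why CSC is the natural format to accept for column-wise processing.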

Pandas has support for sparse features: http://pandas.pydata.org/pandas-docs/stable/sparse.html

In particular, it has a dedicated data structure for 1D sparse data: SparseArray.
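A quick sketch of what SparseArray exposes, assuming a recent pandas where it lives under `pd.arrays` (older releases exposed it as `pd.SparseArray`). The values are made up.

```python
import numpy as np
import pandas as pd

# Hypothetical mostly-zero feature column.
dense = np.array([0.0, 0.0, 1.5, 0.0, 0.0, 2.0])
arr = pd.arrays.SparseArray(dense, fill_value=0.0)

# Only the entries that differ from fill_value are stored...
stored_values = arr.sp_values
# ...together with their positions in the original 1D array.
stored_positions = arr.sp_index.to_int_index().indices

# Fraction of entries that are explicitly stored.
density = arr.density

# Converting back to a dense NumPy array restores the fill values.
roundtrip = np.asarray(arr)
```

The pair (`stored_positions`, `stored_values`) is essentially the content of one CSC column, so supporting one representation should make the other straightforward.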

There is also https://github.com/pydata/sparse, and I believe the ecosystem will converge at some point. I would be in favor of leveraging the data structure from Pandas, to start with the most adopted solution that allows for heterogeneously typed features (a mix of dense and sparse columns, categorical or numerical).
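As a rough illustration of the heterogeneous case, the sketch below builds a DataFrame with one dense, one sparse and one categorical column and classifies each column by dtype. The column names and the `describe_columns` helper are hypothetical and not part of any existing API in this project; it assumes a pandas version that provides `pd.SparseDtype` and `pd.arrays.SparseArray`.

```python
import numpy as np
import pandas as pd

# Hypothetical dataset mixing dense, sparse and categorical features.
df = pd.DataFrame({
    "age": np.array([23.0, 41.0, 35.0, 58.0]),
    "clicks": pd.arrays.SparseArray([0, 0, 7, 0], fill_value=0),
    "city": pd.Categorical(["paris", "lyon", "paris", "nice"]),
})


def describe_columns(df):
    """Map each column name to 'sparse', 'categorical' or 'dense'.

    Illustrative helper: a real implementation would dispatch to a
    dedicated code path for each kind instead of returning labels.
    """
    kinds = {}
    for name, dtype in df.dtypes.items():
        if isinstance(dtype, pd.SparseDtype):
            kinds[name] = "sparse"
        elif isinstance(dtype, pd.CategoricalDtype):
            kinds[name] = "categorical"
        else:
            kinds[name] = "dense"
    return kinds
```

Inspecting dtypes column by column like this would let sparse and dense features coexist in a single input without densifying everything up front.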
