Consider supporting the ELLPACK format #359

Open
howsiyu opened this issue May 10, 2024 · 0 comments

howsiyu commented May 10, 2024

In practice, many feature matrices have only a small number of non-zero entries per row. For example, data that comes from one-hot encoding has exactly one non-zero entry per row. CategoricalMatrix handles such data nicely as long as every non-zero entry is one. That is not always the case, though: data that comes from sklearn.preprocessing.SplineTransformer, for example, has a few non-zero entries per row with values other than one. Such matrices would be well served by the ELLPACK format, which is a natural generalization of CategoricalMatrix.
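To make the layout concrete, here is a minimal pure-Python sketch of ELLPACK, assuming the usual definition: every row is padded to the same width (the largest number of non-zeros in any row), and the matrix is stored as a values array plus a column-index array of identical shape. The helper names `dense_to_ell` and `ell_matvec` are hypothetical and are not part of any existing library API; a real implementation would presumably use contiguous NumPy arrays rather than nested lists.

```python
def dense_to_ell(dense):
    """Convert a dense matrix (list of rows) to ELLPACK (values, cols).

    Hypothetical helper for illustration only.
    """
    # Collect (column, value) pairs of the non-zero entries in each row.
    rows_nz = [[(j, v) for j, v in enumerate(row) if v != 0] for row in dense]
    # ELLPACK width: the largest number of non-zeros found in any row.
    width = max((len(r) for r in rows_nz), default=0)
    values, cols = [], []
    for r in rows_nz:
        pad = width - len(r)
        # Pad with value 0.0 and column 0 so padded slots contribute nothing.
        values.append([v for _, v in r] + [0.0] * pad)
        cols.append([j for j, _ in r] + [0] * pad)
    return values, cols


def ell_matvec(values, cols, x):
    """Compute y = A @ x for A stored in ELLPACK form."""
    return [
        sum(v * x[j] for v, j in zip(vrow, crow))
        for vrow, crow in zip(values, cols)
    ]


# Rows with at most two non-zeros, loosely resembling spline basis output.
dense = [
    [0.0, 0.25, 0.75, 0.0],
    [0.5, 0.5, 0.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]
ell_values, ell_cols = dense_to_ell(dense)
ell_y = ell_matvec(ell_values, ell_cols, [1.0, 2.0, 3.0, 4.0])
```

In this picture, a one-hot matrix is the special case with width 1 and all stored values equal to one, so only the column-index array needs to be kept.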

Another option is to support the Sliced ELLPACK (SELL) format, which handles general sparse matrices relatively well. SplitMatrix could then consist of just a dense matrix and a SELL matrix.
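For comparison, here is a rough pure-Python sketch of SELL, assuming the common definition: rows are grouped into slices of a fixed height, and each slice is padded only to the widest row within that slice, so a few long rows do not inflate the storage of the whole matrix. The names `dense_to_sell`, `sell_matvec`, and `slice_height` are hypothetical, and the nested-list storage is chosen for readability rather than performance.

```python
def dense_to_sell(dense, slice_height=2):
    """Convert a dense matrix to a list of per-slice ELLPACK blocks.

    Hypothetical helper for illustration only.
    """
    slices = []
    for start in range(0, len(dense), slice_height):
        block = dense[start:start + slice_height]
        rows_nz = [[(j, v) for j, v in enumerate(row) if v != 0] for row in block]
        # Each slice is padded only to its own widest row.
        width = max((len(r) for r in rows_nz), default=0)
        values = [[v for _, v in r] + [0.0] * (width - len(r)) for r in rows_nz]
        cols = [[j for j, _ in r] + [0] * (width - len(r)) for r in rows_nz]
        slices.append((values, cols))
    return slices


def sell_matvec(slices, x):
    """Compute y = A @ x for A stored as a list of ELLPACK slices."""
    y = []
    for values, cols in slices:
        for vrow, crow in zip(values, cols):
            y.append(sum(v * x[j] for v, j in zip(vrow, crow)))
    return y


# One dense row (row 2) among otherwise very sparse rows.
dense = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 2.0, 0.0, 0.0],
    [1.0, 2.0, 3.0, 4.0],
    [0.0, 0.0, 5.0, 0.0],
]
sell_slices = dense_to_sell(dense, slice_height=2)
sell_y = sell_matvec(sell_slices, [1.0, 1.0, 1.0, 1.0])
# Number of stored slots, including padding, across all slices.
sell_slots = sum(len(row) for values, _ in sell_slices for row in values)
```

With a slice height of 2, the first slice is stored with width 1 and the second with width 4, which gives 10 stored slots in total. Plain ELLPACK would pad all four rows to width 4 and store 16.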
