Optionally represent monetdb str column as fixed-width numpy string array #165

Open
gijzelaerr opened this issue Dec 29, 2021 · 0 comments

Comments

@gijzelaerr
Collaborator

We could also consider an option to return MonetDB string columns as NumPy fixed-width string arrays in the result. I know database people find this ugly, but this is how NumPy works and what makes it fast (in the context of Python). A common use case in pandas-based data science is a short string column, containing a dataset label for example. Fixed-width strings then start to make sense again. The issue now is that if you have a big table with ints and floats and you call .fetchdf(), monetdbe is fast, but as soon as one of the columns is a (short) string, performance plummets, since it falls back to the Python processing mode (with a warning). We could make this a configurable option for .fetchdf() and .fetchnumpy(), where you indicate whether you need speed or are memory limited.
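To make the trade-off concrete, here is a minimal NumPy-only sketch of the two representations. It does not use monetdbe at all. The `labels` list is made-up sample data standing in for a short string column, and nothing here reflects how monetdbe builds its result arrays.

```python
import numpy as np

# Made-up sample data standing in for a short label column (assumption).
labels = ["train", "test", "validation", "test"]

# Object array: every element is a reference to a separate Python str,
# so the values live outside the array buffer.
as_object = np.array(labels, dtype=object)

# Fixed-width byte-string array: one contiguous buffer. NumPy picks the
# width from the longest value ("validation" is 10 bytes) and zero-pads
# the shorter ones.
as_fixed = np.array(labels, dtype="S")

print(as_fixed.itemsize)  # 10 bytes per element
print(as_fixed.nbytes)    # 40 bytes in total (4 rows * 10 bytes)

# Comparisons on the fixed-width array run as ordinary array operations.
is_test = as_fixed == b"test"
print(int(is_test.sum()))  # 2
```

The memory cost of the fixed-width form is the row count times the longest value, so one long outlier inflates the whole column. That is the speed-versus-memory choice the proposed option would expose.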

@gijzelaerr gijzelaerr added this to the 0.12 milestone Dec 29, 2021