performance #8

daroczig · 2015-11-11T21:25:39Z

Fetching more than 100 rows seems to be rather slow compared to RJDBC (with the same jar files) or to exporting data to CSV via impala-shell, so I suspect there might be some performance issues when converting the results returned by the JDBC driver to R. See some related benchmarks at http://datascience.la/r-and-impala-its-better-to-kiss-than-using-java

austincv · 2015-11-12T16:11:24Z

Thanks for letting me know. Yes, I agree with you. I think a huge amount of time is spent converting the JDBC result set to R data types. I assume that adding an option to the function to return the raw JDBC result as is, and letting the user decide how to handle it, would give fetch times similar to RJDBC's. In our use case we didn't really need to transfer huge data sets back to R, which is probably why we overlooked this. I am a bit tied up right now. I will look into it and get back to you late next week.
Brilliant blog post by the way :)

daroczig · 2015-11-12T18:33:41Z

Thank you very much, @austincv, and keep up the good work!

ankurmitujjain · 2016-05-05T07:05:01Z

Hello Team,

Any update on this?

Or please point me in the right direction, and I can submit a pull request for this.

Thanks
Ankur
