
How do I handle "large" datasets in spark-rapids-ml benchmarks #10567

Answered by revans2
an-ys asked this question in Q&A

@an-ys The first error that you got appears to be the out-of-memory killer kicking in and killing your process. That is related to running out of host memory, not GPU memory. You also ran out of GPU memory, which is what showed up in your second stack trace. Right now we handle running out of GPU memory much better than running out of CPU memory. For GPU memory we have limits, and we end up spilling data or pausing threads to make it work. It is not 100% perfect, but it does work rather well.
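
For context, GPU memory pressure with the RAPIDS Accelerator is usually tuned through Spark configs rather than code. Below is a minimal sketch assuming the plugin jar is on the classpath; the config keys are real RAPIDS Accelerator settings, but the values are illustrative placeholders you would tune for your own GPUs and workload:

```python
from pyspark.sql import SparkSession

# Minimal sketch of GPU-side memory settings for the RAPIDS Accelerator.
# Values below are placeholders, not recommendations.
spark = (
    SparkSession.builder
    .appName("spark-rapids-ml-benchmark")
    .config("spark.plugins", "com.nvidia.spark.SQLPlugin")
    # Fraction of GPU memory pooled at startup; spilling/pausing kicks in
    # as this pool fills up.
    .config("spark.rapids.memory.gpu.allocFraction", "0.8")
    # Fewer concurrent tasks on the GPU lowers peak GPU memory pressure.
    .config("spark.rapids.sql.concurrentGpuTasks", "2")
    .getOrCreate()
)
```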

For the CPU we are still in the process of making that work. The plan is to apply the same strategies that we use for GPU memory, but it is not done yet. But this is only memory limits on the Java side of things, not t…
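
Until host-side limits land, the practical guard against the OOM killer is to size the executor so that JVM heap plus off-heap overhead fits within physical RAM. A minimal sketch, assuming a YARN- or Kubernetes-style deployment where `spark.executor.memoryOverhead` covers native allocations; the sizes are placeholders for whatever your nodes actually have:

```python
from pyspark.sql import SparkSession

# Minimal sketch of host-side memory headroom. The goal is that
# heap + overhead (off-heap, pinned memory) stays under physical RAM
# so the Linux OOM killer never fires.
spark = (
    SparkSession.builder
    .config("spark.executor.memory", "16g")
    # Headroom for off-heap/native allocations outside the JVM heap.
    .config("spark.executor.memoryOverhead", "8g")
    # Pinned host memory used for GPU transfers; account for it in
    # the overhead above.
    .config("spark.rapids.memory.pinnedPool.size", "2g")
    .getOrCreate()
)
```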

Answer selected by an-ys