Replies: 2 comments 6 replies
-
Further, if I implement certain functions as Spark native methods that are optimized through Spark, can those be offloaded to GPUs for computation for further performance enhancements?
-
We have ways to enable and disable classes of operations so they do or do not go on the GPU, but we don't have a way to enable/disable them for specific stages in a plan. The stages of a plan show up mostly during physical planning, and with AQE enabled those stages can even change while the plan is running. Because of that there is no good way to address or tag a specific stage of processing. We don't even have a good way to tag a specific operation to run on the GPU or not. I would love to have the ability to do what you are asking, but Spark just does not currently have the tools for it. If you could give some specific examples of what you are trying to do or understand, we can work with you on getting your questions answered.

Also, I don't know what you mean by "Spark native method". Are you referring to processing using an RDD directly? Are you talking about using the Dataset APIs instead of the DataFrame APIs? Currently we only support DataFrame. The Dataset APIs usually involve a lot of reflection and would require some form of introspection into the JVM byte code to have any hope of supporting them. We have thought about ways to try to support it, but it is rather complicated and has just not been the highest priority.
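To make the first point concrete, here is a rough sketch of how enabling/disabling classes of operations on the GPU looks in practice with the spark-rapids plugin. This is a hedged example: the exact config keys available depend on your plugin version, and `FilterExec`/`Abs` are just illustrative choices of an exec and an expression; check the plugin's configuration docs for the full list.

```shell
# Sketch: launching with the RAPIDS Accelerator and turning specific
# operation classes off. The pattern is per-exec and per-expression
# configs, NOT per-stage -- there is no config that targets a stage.
spark-submit \
  --conf spark.plugins=com.nvidia.spark.SQLPlugin \
  --conf spark.rapids.sql.enabled=true \
  --conf spark.rapids.sql.exec.FilterExec=false \
  --conf spark.rapids.sql.expression.Abs=false \
  --conf spark.rapids.sql.explain=NOT_ON_GPU \
  your_app.py
```

The `spark.rapids.sql.explain` setting is useful here: it logs why pieces of the plan did or did not end up on the GPU, which is the closest current tool for understanding the CPU/GPU placement of your pipeline.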
-
Suppose I have a certain number of stages in a Spark pipeline. How can I ensure that those stages will run on GPUs while the other ones will not? This would help me fine-tune the data movement between CPU and GPU so it happens only when I need to use the GPUs.