Performance issue with `road_usa` when using SSSP algorithm.
#938
Labels
🐲 enhancement
Add or request enhancements to existing functionalities within gunrock.
🍻 help wanted
Extra attention is needed
❓ question
Usage or code base related questions.
Only tested and profiled for `road_usa`. The problem is that the algorithm takes 7-8 seconds on my machine with a GTX 1080, much slower than the CPU equivalent. When profiled over that range (so, just timing `enactor.enact()`), the `nvtx` range reports the same time, around 8 seconds. However, the GPU metrics only total around 800 ms (10x faster).

This issue was pointed out by @jdwapman. I think further profiling is required to figure out where in the CPU activity most of this slowdown is coming from (that is, if I am understanding the profiled result above correctly). My initial understanding was that there are ~30,000 calls to `cudaLaunchKernel`, and that the launch overhead may be the cause. But that should be around 10 us each in the worst case, which is only about 0.3 seconds in total?