[GraphBolt] ItemSampler CPU usage too high, especially hetero case. #7315
Comments
As @mfbalin mentioned, specific logic for |
MiniBatch.blocks
uses the CPU when GPU sampling.
It now looks like |
dgl/heterograph.py:6407 make_canonical_edges uses numpy for some ops. |
Line 583 in 41a3848
|
@peizhou001 |
MiniBatch.blocks
uses the CPU when GPU sampling.
@mfbalin tried to bypass the whole forward including |
Update: CPU usage is high even for the homogeneous examples. Some recent change might have caused us to utilize the CPU even in the pureGPU mode. @frozenbugs do you think it could be the logic to move MiniBatch to device?
This commit does not have the issue. Somewhere between current master and the reported commit above, there was a change that caused CPU utilization on the GPU code path.
Easiest way to test is to run |
Transferred attr list:
Actually transferred by calling
|
Looks like blocks is called inside |
I see. Do you think we need a check when calling MiniBatch.to()?
I figured it out. When we filter which attributes to transfer, we end up calling the blocks property. I am making a quick patch now.
CPU usage is still higher than 100%, though, so I am not sure if I resolved the whole issue.
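To illustrate the kind of problem described here, the following is a minimal sketch, not the GraphBolt implementation. `ToyMiniBatch` and both helper functions are hypothetical names invented for this example; the assumption is only that `blocks` is a lazily computed property and that the attribute filter reads every public attribute by name.

```python
class ToyMiniBatch:
    """Hypothetical stand-in for a mini-batch; not the GraphBolt class."""

    def __init__(self):
        self.seeds = [0, 1, 2]
        self.labels = None
        self._blocks_calls = 0  # counts how often the expensive property runs

    @property
    def blocks(self):
        # Stand-in for an expensive conversion that may run on the CPU.
        self._blocks_calls += 1
        return ["block"]


def transferable_attrs_naive(batch):
    """Filter by reading every public name; this evaluates properties too."""
    names = []
    for name in dir(batch):
        if name.startswith("_"):
            continue
        value = getattr(batch, name)  # triggers the `blocks` property
        if value is not None and not callable(value):
            names.append(name)
    return names


def transferable_attrs_safe(batch):
    """Filter using instance attributes only; properties are never evaluated."""
    return [
        name
        for name, value in vars(batch).items()
        if not name.startswith("_") and value is not None
    ]
```

In this sketch the naive filter computes `blocks` as a side effect of deciding what to transfer, while the `vars()`-based filter only inspects data already stored on the instance.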
Even with #7330, we need to investigate where the high CPU usage comes from. CPU usage is 800% for our main pure-gpu ( |
The hetero example's CPU usage is still 4000%.
@Rhett-Ying Here, we can see the last iterations of the training dataloader for the hetero example. Since we have a prefetcher thread with a buffer size of 2, the last 2 iterations don't have excessive CPU utilization, as the computation for the last 2 iterations has already finished. This indicates that the high CPU utilization is due to the ItemSampler.
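The reasoning about the prefetcher can be sketched with a toy producer/consumer, assuming a background thread that fills a bounded buffer. `run_prefetch` and its parameters are hypothetical and do not correspond to the GraphBolt dataloader; the sleep merely stands in for a training step.

```python
import queue
import threading
import time


def run_prefetch(num_batches=6, buffer_size=2):
    """Toy prefetcher: a producer thread fills a bounded buffer."""
    buf = queue.Queue(maxsize=buffer_size)
    producer_done = threading.Event()

    def producer():
        for i in range(num_batches):
            buf.put(i)  # blocks while the buffer is full
        producer_done.set()  # all sampling work is finished

    thread = threading.Thread(target=producer)
    thread.start()

    consumed, after_producer_done = [], []
    for _ in range(num_batches):
        was_done = producer_done.is_set()
        item = buf.get()
        time.sleep(0.01)  # stand-in for one training step
        consumed.append(item)
        if was_done:
            # The producer was already idle when this batch was fetched.
            after_producer_done.append(item)
    thread.join()
    return consumed, after_producer_done
```

Once the producer has finished, at most `buffer_size` batches remain in the buffer, so only the trailing iterations can run while the producer side is idle. That matches the observation that the last 2 iterations show no excessive CPU utilization.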
|
Users with multiple GPUs may not be able to utilize the GPUs effectively due to potential CPU bottleneck. |
🔨Work Item
IMPORTANT:
Project tracker: https://github.com/orgs/dmlc/projects/2
Description
When running the hetero GraphBolt example in the pureGPU mode, the CPU utilization is very high (4000%).
Depending work items or issues