Out of memory issue with BFS #908

Open
YuxinxinChen opened this issue Feb 3, 2022 · 0 comments
Labels
🐛 bug Use to report bugs in the issues or fix bugs in a pull request.

Comments

YuxinxinChen commented Feb 3, 2022

I got an out-of-memory error running BFS on the Twitter dataset on branch v0.5.1.

Here is the error message:

jsrun -n 1 -c ALL_CPUS -a 1 -g 1 -r 1 /ccs/home/yuxinc/gunrock/build3/bin/bfs market /gpfs/alpine/bif115/scratch/yuxinc/gunrock_dataset/twitter/twitter.mtx --validation=1 --iteration-num=10 --undirected=true
Loading Matrix-market coordinate-formatted graph ...
Reading directly from stored binary CSR arrays from /gpfs/alpine/bif115/scratch/yuxinc/gunrock_dataset/twitter/.twitter.mtx.ud.0.bin ...
Done reading (2s).

Degree Histogram (51161011 vertices, 1963031615 edges):
Degree 0: 8174256 (15.98%)
Degree 2^0: 8154210 (15.94%)
Degree 2^1: 6378653 (12.47%)
Degree 2^2: 5920723 (11.57%)
Degree 2^3: 5130731 (10.03%)
Degree 2^4: 10452894 (20.43%)
Degree 2^5: 3090018 (6.04%)
Degree 2^6: 1637352 (3.20%)
Degree 2^7: 983612 (1.92%)
Degree 2^8: 608936 (1.19%)
Degree 2^9: 332954 (0.65%)
Degree 2^10: 214665 (0.42%)
Degree 2^11: 60566 (0.12%)
Degree 2^12: 13359 (0.03%)
Degree 2^13: 5234 (0.01%)
Degree 2^14: 2166 (0.00%)
Degree 2^15: 551 (0.00%)
Degree 2^16: 116 (0.00%)
Degree 2^17: 7 (0.00%)
Degree 2^18: 6 (0.00%)
Degree 2^19: 2 (0.00%)

[/ccs/home/yuxinc/gunrock/gunrock/util/array_utils.cuh, 197 @ gpu 0] labels cudaMalloc failed (CUDA error 2: out of memory)
[/ccs/home/yuxinc/gunrock/examples/bfs/test_bfs.cu, 350 @ gpu 0] BFS Problem Init failed (CUDA error 2: out of memory)
Using 1 GPU: [ 0 ].

[BFS] finished.
avg. elapsed: 0.0000 ms
iterations: 0
load time: 7097.5571 ms
preprocess time: 0.0000 ms
postprocess time: 0.0000 ms
total time: 8659.2469 ms

The dataset is available in the directory /data/yuxin/3_graph_dataset, where twitter.csr is the CSR binary file that Gunrock can read directly.

Besides the OOM error on a single GPU: runs on 2, 3, 4, and 5 GPUs complete without errors, but a run on 6 GPUs fails with the same error.
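
For a rough sense of scale on a single GPU, here is a back-of-envelope estimate of the device memory the graph needs. It is only a sketch: the vertex and edge counts come from the degree histogram header in the log, but the 4-byte vertex IDs, 4-byte edge offsets, 4-byte labels, and the 16 GiB device size are assumptions, not values reported by Gunrock or stated in this issue. The function names are made up for illustration.

```python
# Back-of-envelope device-memory estimate for the graph in the log.
# ASSUMPTIONS (not from the issue): 4-byte vertex IDs, 4-byte edge offsets,
# one 4-byte BFS label per vertex, and a hypothetical 16 GiB device.

NUM_VERTICES = 51_161_011    # from the "Degree Histogram" header in the log
NUM_EDGES = 1_963_031_615    # from the "Degree Histogram" header in the log

ASSUMED_ID_BYTES = 4                   # assumption: 32-bit vertex IDs
ASSUMED_OFFSET_BYTES = 4               # assumption: 32-bit edge offsets
ASSUMED_LABEL_BYTES = 4                # assumption: 32-bit labels
ASSUMED_DEVICE_BYTES = 16 * 1024**3    # assumption: 16 GiB of device memory


def csr_bytes(num_vertices, num_edges,
              id_bytes=ASSUMED_ID_BYTES, offset_bytes=ASSUMED_OFFSET_BYTES):
    """Size of the two CSR arrays: row offsets plus column indices."""
    row_offsets = (num_vertices + 1) * offset_bytes
    column_indices = num_edges * id_bytes
    return row_offsets + column_indices


def labels_bytes(num_vertices, label_bytes=ASSUMED_LABEL_BYTES):
    """Size of a per-vertex label array."""
    return num_vertices * label_bytes


def to_gib(num_bytes):
    """Convert bytes to GiB (2**30 bytes)."""
    return num_bytes / 1024**3


if __name__ == "__main__":
    graph = csr_bytes(NUM_VERTICES, NUM_EDGES)
    labels = labels_bytes(NUM_VERTICES)
    print(f"CSR arrays:   {to_gib(graph):.2f} GiB")
    print(f"labels array: {to_gib(labels):.2f} GiB")
    print(f"left over on the assumed device: "
          f"{to_gib(ASSUMED_DEVICE_BYTES - graph - labels):.2f} GiB")
```

Under these assumptions the CSR arrays alone come to about 7.5 GiB, while the labels array that fails to allocate is only about 0.2 GiB. That suggests other allocations made before labels account for the remaining memory, but the log does not show which buffers those are.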

Thanks for looking into this!

Best,

Yuxin

@YuxinxinChen YuxinxinChen added the 🐛 bug Use to report bugs in the issues or fix bugs in a pull request. label Feb 3, 2022