CUDA error at /builds/ofan/ont_core_cpp/ont_core/common/cuda_common.cpp:232: CUDA_ERROR_INVALID_DEVICE #351

Open
Modernism-01 opened this issue Aug 8, 2023 · 0 comments

Hello,
I followed the instructions and ran megalodon (2.4.1) with guppy_basecall_server (6.0.1+652ffd1) and GPU support, but it always reports an error. Others have encountered similar issues, but I could not solve my problem from their discussions, so I am asking for additional help. Below is my script:

megalodon \
    ~/01.data/01.ONT_data/02.ONT_test/AW/FAST5_PASS \
    --guppy-params "-d /public/home/zenglingsen/04.software/03.Guppy/rerio/basecall_models/" \
    --guppy-server-path /public/home/zenglingsen/04.software/03.Guppy/ont-guppy/bin/guppy_basecall_server \
    --guppy-config res_dna_r941_prom_modbases_5mC_CpG_v001.cfg \
    --outputs basecalls mappings mod_mappings mods per_read_mods \
    --reference /public/home/zenglingsen/01.data/03.Reference/GCF_000003025.6_Sscrofa11.1_genomic.fna \
    --mod-motif m CG 0 \
    --output-directory /public/home/zenglingsen/01.data/01.ONT_data/02.ONT_test/AW/FAST5_PASS/out \
    --overwrite \
    --devices 0 \
    --processes 8

The error file showed:

[10:17:00] Running Megalodon version 2.4.1
[10:17:00] Loading guppy basecalling backend


ERROR: Guppy server initialization failed. See guppy logs in [--output-directory] for more details.
	Try running the guppy server initialization command found in log.txt in order to pinpoint the source of this issue.

This is the content of log.txt:

[10:17:00] Running Megalodon version 2.4.1
DBG 10:17:00 : Command: """/public/home/zenglingsen/04.software/02.Anaconda/Or/envs/pytorch/bin/megalodon /public/home/zenglingsen/01.data/01.ONT_data/02.ONT_test/AW/FAST5_PASS --guppy-params -d /public/home/zenglingsen/04.software/03.Guppy/rerio/basecall_models/ --guppy-server-path /public/home/zenglingsen/04.software/03.Guppy/ont-guppy/bin/guppy_basecall_server --guppy-config res_dna_r941_prom_modbases_5mC_CpG_v001.cfg --outputs basecalls mappings mod_mappings mods per_read_mods --reference /public/home/zenglingsen/01.data/03.Reference/GCF_000003025.6_Sscrofa11.1_genomic.fna --mod-motif m CG 0 --output-directory /public/home/zenglingsen/01.data/01.ONT_data/02.ONT_test/AW/FAST5_PASS/out --overwrite --devices 1 --processes 8""" --- MainProcess-MainThread megalodon.py:1793
[10:17:00] Loading guppy basecalling backend
DBG 10:17:00 : Guppy version: "6.0.1" --- MainProcess-MainThread backends.py:939
DBG 10:17:00 : Pyguppy version: "6.0.1" --- MainProcess-MainThread backends.py:940
DBG 10:17:00 : guppy server init command: "/public/home/zenglingsen/04.software/03.Guppy/ont-guppy/bin/guppy_basecall_server -p auto -l /public/home/zenglingsen/01.data/01.ONT_data/02.ONT_test/AW/FAST5_PASS/out/guppy_log -c res_dna_r941_prom_modbases_5mC_CpG_v001.cfg --quiet --post_out -x cuda:1 -d /public/home/zenglingsen/04.software/03.Guppy/rerio/basecall_models/" --- MainProcess-MainThread backends.py:1018
DBG 10:17:01 : Found guppy log file: /public/home/zenglingsen/01.data/01.ONT_data/02.ONT_test/AW/FAST5_PASS/out/guppy_log/guppy_basecall_server_log-2023-08-08_10-17-01.log --- MainProcess-MainThread backends.py:1033


ERROR: Guppy server initialization failed. See guppy logs in [--output-directory] for more details.
	Try running the guppy server initialization command found in log.txt in order to pinpoint the source of this issue.

This is the guppy_server log file:

2023-08-08 10:17:01.473438 [guppy/message] ONT Guppy basecall server software version 6.0.1+652ffd1, client-server API version 10.0.0
log path: /public/home/zenglingsen/01.data/01.ONT_data/02.ONT_test/AW/FAST5_PASS/out/guppy_log
chunk size: 2000
chunks per runner: 512
max queued reads: 2000
num basecallers: 4
num socket threads: 2
max returned events: 50000
gpu device: cuda:1
kernel path:
runners per device: 4
2023-08-08 10:17:01.475945 [guppy/info] crashpad_handler not supported on this platform.
2023-08-08 10:17:01.478127 [guppy/info] Listening on port ipc:///tmp/3763-2667-29cb-f0a3.
2023-08-08 10:17:01.607464 [guppy/error] CUDA error at /builds/ofan/ont_core_cpp/ont_core/common/cuda_common.cpp:232: CUDA_ERROR_INVALID_DEVICE. Error initialising basecall server using port: ipc://auto. Aborting.
2023-08-08 10:17:01.608638 [guppy/message] The basecall server has shut down successfully.

Finally, this is the GPU information for the partition:

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 510.47.03    Driver Version: 510.47.03    CUDA Version: 11.6     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:86:00.0 Off |                  N/A |
| 44%   33C    P0    93W / 350W |      0MiB / 24576MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
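
One detail worth cross-checking in the output above: the command recorded in log.txt uses --devices 1 and the server was started with -x cuda:1, while nvidia-smi lists a single GPU with index 0. The following is only a rough sketch of how that comparison could be automated, not a confirmed diagnosis. The function names are hypothetical, the parser only handles the `cuda:N` form seen in this log, and it assumes nvidia-smi indices match CUDA ordinals (which may not hold if CUDA_VISIBLE_DEVICES is set by the scheduler).

```python
import re
import subprocess


def requested_cuda_indices(guppy_log_text):
    """Return the indices named on the 'gpu device:' line of a guppy server log.

    Assumption: devices are written as 'cuda:N', as in the log in this post.
    """
    match = re.search(r"^gpu device:\s*(.+)$", guppy_log_text, flags=re.MULTILINE)
    if match is None:
        return []
    return [int(i) for i in re.findall(r"cuda:(\d+)", match.group(1))]


def visible_gpu_indices(nvidia_smi_csv):
    """Parse the output of `nvidia-smi --query-gpu=index --format=csv,noheader`."""
    return [int(line.strip()) for line in nvidia_smi_csv.splitlines()
            if line.strip().isdigit()]


def missing_devices(guppy_log_text, nvidia_smi_csv):
    """List requested CUDA indices that nvidia-smi does not report."""
    visible = set(visible_gpu_indices(nvidia_smi_csv))
    return [i for i in requested_cuda_indices(guppy_log_text) if i not in visible]


def query_nvidia_smi():
    """Run nvidia-smi on the current node (must be called on the GPU node itself)."""
    return subprocess.run(
        ["nvidia-smi", "--query-gpu=index", "--format=csv,noheader"],
        capture_output=True, text=True, check=True,
    ).stdout


# Values copied from this post: the guppy log line and the single GPU at index 0.
sample_log = "gpu device: cuda:1\nkernel path:\nrunners per device: 4\n"
sample_smi = "0\n"
print(missing_devices(sample_log, sample_smi))  # prints [1]
```

On the GPU node, `missing_devices(open(path_to_guppy_log).read(), query_nvidia_smi())` would run the same check against live data, where `path_to_guppy_log` stands for the guppy_basecall_server log file named in log.txt.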

I have followed the suggested steps from discussions such as "Guppy can't run on GPU with trained model" #46 and "error while running guppy 4.4.1 on GPU mode" #6, but they did not work.

Any advice or comments would be greatly appreciated. Thank you very much.
