
CUDA out of memory #115

Open
zzy221127 opened this issue Nov 28, 2022 · 1 comment

Comments

@zzy221127

Dear author:

I run FastFold on a 4-GPU machine; each GPU has 24 GiB of memory.

I run inference.py on a FASTA sequence of length 1805 AA (without Triton), with the parameter --gpus 3,

and the error is:

RuntimeError: CUDA out of memory. Tried to allocate 29.26 GiB (GPU 0; 23.70 GiB total capacity; 9.63 GiB already allocated; 11.79 GiB free; 10.65 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

My questions are:

  1. Why is only one GPU (GPU 0, rather than GPU 0, GPU 1, and GPU 2) used to calculate the total memory? What should I do to get around this?

  2. Is there a way to run extremely long FASTA files, e.g. 4000 AA?

I appreciate your reply, thank you.

@Shenggan
Contributor

I think you can check args.gpus in the code. It should be 3 if you added the parameter correctly.

AlphaFold's embedding representations take up a lot of memory as the sequence length increases. To reduce memory usage, you should add the parameters --chunk_size [N] and --inplace to the command line or to the shell script ./inference.sh. The smaller you set N, the less memory will be used, but it will slow down inference; see the sketch below.
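A minimal sketch of such an invocation, assuming the flags mentioned above; the FASTA path, database path, and a chunk size of 64 are placeholders, so check `python inference.py --help` in your FastFold version for the exact positional arguments:

```bash
# Hypothetical example -- target.fasta, the mmCIF path, and chunk_size=64
# are placeholders, not values from this issue.
python inference.py target.fasta /path/to/pdb_mmcif/mmcif_files \
    --gpus 3 \
    --chunk_size 64 \
    --inplace
```

If memory still runs out, lowering --chunk_size further trades additional speed for a smaller peak memory footprint.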
