
CUDA out of memory #115

Open
zzy221127 opened this issue Nov 28, 2022 · 1 comment

Comments

@zzy221127

Dear author:

I run FastFold on a 4-GPU machine; each GPU has 24 GiB of memory.

I run inference.py on a FASTA sequence of length 1805 AA (without Triton), with the parameter --gpus 3,

and the error is:

RuntimeError: CUDA out of memory. Tried to allocate 29.26 GiB (GPU 0; 23.70 GiB total capacity; 9.63 GiB already allocated; 11.79 GiB free; 10.65 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

My questions are:

  1. Why is only one GPU (GPU 0, rather than GPU 0, GPU 1, and GPU 2) used to calculate the total memory? What should I do to get around this?

  2. Is there a way to run extremely long FASTA files, e.g. 4000 AA?

I appreciate your reply, thank you.

@Shenggan
Contributor

I think you can check args.gpus in the code. It should be 3 if you added the parameter correctly.

AlphaFold's embedding representations take up a lot of memory as the sequence length increases. To reduce memory usage, you should add the parameters --chunk_size [N] and --inplace to the command line or to the shell script ./inference.sh. The smaller you set N, the less memory will be used, but it will slow down inference; see the sketch below.
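A minimal sketch of such an invocation, assuming the flags mentioned above; the FASTA path, database path, and a chunk size of 64 are placeholders, so check `python inference.py --help` in your FastFold version for the exact positional arguments:

```bash
# Hypothetical example -- target.fasta, the mmCIF path, and chunk_size=64
# are placeholders, not values from this issue.
python inference.py target.fasta /path/to/pdb_mmcif/mmcif_files \
    --gpus 3 \
    --chunk_size 64 \
    --inplace
```

If memory still runs out, lowering --chunk_size further trades additional speed for a smaller peak memory footprint.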
