
[bug]: CUDA out of memory error when upscaling x4 (or x2 twice) #4184

Open
mcbexx opened this issue Aug 7, 2023 · 3 comments · May be fixed by #6132
Labels
bug (Something isn't working) · hardware (Issues related to specific hardware/system resources)

Comments


mcbexx commented Aug 7, 2023

Is there an existing issue for this?

  • I have searched the existing issues

OS

Windows

GPU

cuda

VRAM

8 GB

What version did you experience this issue on?

3.0.1 hotfix 3

What happened?

*** Good old "turn it off and on again" (relaunching) seems to fix the issue temporarily, but you may want to look into it anyway as a possible memory issue ***

When trying to use any of the x4 upscalers on a 640x960 image (after running a couple of txt2img and img2img generations), I get a CUDA out of memory error. To my uninitiated eye it looks like PyTorch is hogging 5.3 GiB of VRAM for no discernible reason; it just likes to eat VRAM, I guess 😆.
I am using xformers (relevant for upscaling?) and the "Free memory after each image generation" option, so I'm not quite sure why I'm running into memory limitations.

This also happens when trying to upscale the 960x640 image x2 twice.
I never had this issue when upscaling to 4K images with 2.5.3.
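For scale, a back-of-the-envelope estimate (my own arithmetic, not from the logs, assuming Real-ESRGAN's default RRDBNet with 64 feature channels in float32) of a single feature map at the final upsampling step lands close to the failed allocation:

```python
# Hedged estimate: RRDBNet in Real-ESRGAN x4 carries 64-channel float32
# feature maps; after the second 2x nearest interpolation a 640x960
# input becomes a 2560x3840 feature map.
height, width, channels, bytes_per_float = 3840, 2560, 64, 4
feature_map_gib = height * width * channels * bytes_per_float / 2**30
print(f"{feature_map_gib:.2f} GiB")  # ~2.34 GiB
```

That is in the same ballpark as the 2.41 GiB allocation that fails in the traceback below, so the peak really is dominated by a single full-resolution feature map rather than a leak alone.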

[2023-08-07 04:36:04,131]::[InvokeAI]::ERROR --> Traceback (most recent call last):
File "N:\InvokeAI_3.venv\lib\site-packages\invokeai\app\services\processor.py", line 86, in __process
outputs = invocation.invoke(
File "N:\InvokeAI_3.venv\lib\site-packages\invokeai\app\invocations\upscale.py", line 103, in invoke
upscaled_image, img_mode = upsampler.enhance(cv_image)
File "N:\InvokeAI_3.venv\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "N:\InvokeAI_3.venv\lib\site-packages\realesrgan\utils.py", line 223, in enhance
self.process()
File "N:\InvokeAI_3.venv\lib\site-packages\realesrgan\utils.py", line 115, in process
self.output = self.model(self.img)
File "N:\InvokeAI_3.venv\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "N:\InvokeAI_3.venv\lib\site-packages\basicsr\archs\rrdbnet_arch.py", line 117, in forward
feat = self.lrelu(self.conv_up2(F.interpolate(feat, scale_factor=2, mode='nearest')))
File "N:\InvokeAI_3.venv\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "N:\InvokeAI_3.venv\lib\site-packages\torch\nn\modules\conv.py", line 463, in forward
return self._conv_forward(input, self.weight, self.bias)
File "N:\InvokeAI_3.venv\lib\site-packages\torch\nn\modules\conv.py", line 459, in _conv_forward
return F.conv2d(input, weight, bias, self.stride,
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 2.41 GiB (GPU 0; 8.00 GiB total capacity; 5.56 GiB already allocated; 280.02 MiB free; 5.73 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

[2023-08-07 04:36:04,135]::[InvokeAI]::ERROR --> Error while invoking:
CUDA out of memory. Tried to allocate 2.41 GiB (GPU 0; 8.00 GiB total capacity; 5.56 GiB already allocated; 280.02 MiB free; 5.73 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
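As the error message itself suggests, fragmentation (reserved >> allocated) can sometimes be mitigated by capping PyTorch's allocator split size. A minimal sketch, assuming the documented PYTORCH_CUDA_ALLOC_CONF mechanism; the 512 value is an illustrative example, not a setting tested in this thread:

```python
import os

# Must be set before torch initializes CUDA (i.e. before the first
# CUDA allocation). max_split_size_mb:512 is an arbitrary example value.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:512"

# import torch  # torch picks the setting up on first CUDA use
```

Equivalently, the variable can be exported in the shell that launches InvokeAI.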

Screenshots

No response

Additional context

No response

Contact Details

No response

@mcbexx mcbexx added the bug Something isn't working label Aug 7, 2023
@Millu Millu added the hardware Issues related to specific hardware/system resources label Aug 23, 2023
@Harvester62
Contributor

Is this bug report still relevant, given that we are now close to the v4.0.0 release? Has the user at least updated to v3.7 and tested there?

@psychedelicious
Collaborator

I believe the root issue is the lack of proper RAM and VRAM management of these models. #6132 will resolve that.

@psychedelicious psychedelicious linked a pull request Apr 27, 2024 that will close this issue
@lstein
Collaborator

lstein commented Apr 28, 2024

#6132 does seem to fix the issue. I've done a little testing with this PR, and the VRAM used by the upscaler model is now released either when upscaling is done (if lazy_offload is False) or when the next model needs the space (if lazy_offload is True).
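For readers unfamiliar with the lazy_offload behavior described above, here is a toy sketch of the idea (class and model names are hypothetical; this is not InvokeAI's implementation): with lazy offload, a finished model stays resident in VRAM until another model actually needs the space; without it, VRAM is released eagerly as soon as the model finishes.

```python
class LazyModelCache:
    """Toy model of the lazy_offload behavior (hypothetical, not InvokeAI code)."""

    def __init__(self, capacity_gib: float, lazy_offload: bool = True):
        self.capacity = capacity_gib
        self.lazy_offload = lazy_offload
        self.resident: dict[str, float] = {}  # model name -> VRAM in GiB

    def run(self, name: str, size_gib: float) -> None:
        # Evict idle models (oldest first) until the new one fits.
        while self.resident and sum(self.resident.values()) + size_gib > self.capacity:
            del self.resident[next(iter(self.resident))]
        self.resident[name] = size_gib
        if not self.lazy_offload:
            # Eager mode: release VRAM as soon as the model finishes.
            del self.resident[name]

cache = LazyModelCache(capacity_gib=8.0, lazy_offload=True)
cache.run("sd_unet", 5.0)
cache.run("esrgan_x4", 4.0)    # sd_unet is evicted to make room
print(sorted(cache.resident))  # ['esrgan_x4']
```

The OOM in this issue corresponds to the pre-#6132 behavior where the upscaler's allocation was neither tracked nor evictable, so nothing made room for it.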
