
[bug]: CUDA out of memory error when upscaling x4 (or x2 twice) #4184

Open
mcbexx opened this issue Aug 7, 2023 · 3 comments · May be fixed by #6132
Labels
bug (Something isn't working) · hardware (Issues related to specific hardware/system resources)

Comments


mcbexx commented Aug 7, 2023

Is there an existing issue for this?

  • I have searched the existing issues

OS

Windows

GPU

cuda

VRAM

8 GB

What version did you experience this issue on?

3.0.1 hotfix 3

What happened?

*** Good old "turn it off and on again" (relaunching) seems to fix the issue temporarily, but you may want to look into it anyway as a possible memory issue ***

When trying to use any of the x4 upscalers on a 640x960 image (after running a couple of txt2img and img2img generations), I get a CUDA out of memory error. To my uninitiated eye it looks like PyTorch is hogging 5.3 GiB of VRAM for no discernible reason; it just likes to eat VRAM, I guess 😆.
I am using xformers (relevant for upscaling?) and the "Free memory after each image generation" option, so I'm not quite sure why I'm running into memory limitations.

This also happens when trying to upscale the 960x640 image x2 twice.
I never had this issue when upscaling to 4K images with 2.5.3.
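For scale, a back-of-the-envelope estimate (my own arithmetic, not from the logs, assuming Real-ESRGAN's default RRDBNet with 64 feature channels in float32) of a single feature map at the final upsampling step lands close to the failed allocation:

```python
# Hedged estimate: RRDBNet in Real-ESRGAN x4 carries 64-channel float32
# feature maps; after the second 2x nearest interpolation a 640x960
# input becomes a 2560x3840 feature map.
height, width, channels, bytes_per_float = 3840, 2560, 64, 4
feature_map_gib = height * width * channels * bytes_per_float / 2**30
print(f"{feature_map_gib:.2f} GiB")  # ~2.34 GiB
```

That is in the same ballpark as the 2.41 GiB allocation that fails in the traceback below, so the peak really is dominated by a single full-resolution feature map rather than a leak alone.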

[2023-08-07 04:36:04,131]::[InvokeAI]::ERROR --> Traceback (most recent call last):
File "N:\InvokeAI_3.venv\lib\site-packages\invokeai\app\services\processor.py", line 86, in __process
outputs = invocation.invoke(
File "N:\InvokeAI_3.venv\lib\site-packages\invokeai\app\invocations\upscale.py", line 103, in invoke
upscaled_image, img_mode = upsampler.enhance(cv_image)
File "N:\InvokeAI_3.venv\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "N:\InvokeAI_3.venv\lib\site-packages\realesrgan\utils.py", line 223, in enhance
self.process()
File "N:\InvokeAI_3.venv\lib\site-packages\realesrgan\utils.py", line 115, in process
self.output = self.model(self.img)
File "N:\InvokeAI_3.venv\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "N:\InvokeAI_3.venv\lib\site-packages\basicsr\archs\rrdbnet_arch.py", line 117, in forward
feat = self.lrelu(self.conv_up2(F.interpolate(feat, scale_factor=2, mode='nearest')))
File "N:\InvokeAI_3.venv\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "N:\InvokeAI_3.venv\lib\site-packages\torch\nn\modules\conv.py", line 463, in forward
return self._conv_forward(input, self.weight, self.bias)
File "N:\InvokeAI_3.venv\lib\site-packages\torch\nn\modules\conv.py", line 459, in _conv_forward
return F.conv2d(input, weight, bias, self.stride,
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 2.41 GiB (GPU 0; 8.00 GiB total capacity; 5.56 GiB already allocated; 280.02 MiB free; 5.73 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

[2023-08-07 04:36:04,135]::[InvokeAI]::ERROR --> Error while invoking:
CUDA out of memory. Tried to allocate 2.41 GiB (GPU 0; 8.00 GiB total capacity; 5.56 GiB already allocated; 280.02 MiB free; 5.73 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
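As the error message itself suggests, fragmentation (reserved >> allocated) can sometimes be mitigated by capping PyTorch's allocator split size. A minimal sketch, assuming the documented PYTORCH_CUDA_ALLOC_CONF mechanism; the 512 value is an illustrative example, not a setting tested in this thread:

```python
import os

# Must be set before torch initializes CUDA (i.e. before the first
# CUDA allocation). max_split_size_mb:512 is an arbitrary example value.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:512"

# import torch  # torch picks the setting up on first CUDA use
```

Equivalently, the variable can be exported in the shell that launches InvokeAI.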

Screenshots

No response

Additional context

No response

Contact Details

No response

@mcbexx mcbexx added the bug Something isn't working label Aug 7, 2023
@Millu Millu added the hardware Issues related to specific hardware/system resources label Aug 23, 2023
@Harvester62
Contributor

Is this bug report still relevant, given that we are now close to the v4.0.0 release? Has the user at least updated to v3.7 and tested there?

@psychedelicious
Collaborator

I believe the root issue is the lack of proper RAM and VRAM management of these models. #6132 will resolve that.

@psychedelicious psychedelicious linked a pull request Apr 27, 2024 that will close this issue
@lstein
Collaborator

lstein commented Apr 28, 2024

#6132 does seem to fix the issue. I've done a little testing with this PR, and the VRAM used by the upscaler model is now released either when upscaling is done (if lazy_offload is False) or when the next model needs the space (if lazy_offload is True).
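For readers unfamiliar with the lazy_offload behavior described above, here is a toy sketch of the idea (class and model names are hypothetical; this is not InvokeAI's implementation): with lazy offload, a finished model stays resident in VRAM until another model actually needs the space; without it, VRAM is released eagerly as soon as the model finishes.

```python
class LazyModelCache:
    """Toy model of the lazy_offload behavior (hypothetical, not InvokeAI code)."""

    def __init__(self, capacity_gib: float, lazy_offload: bool = True):
        self.capacity = capacity_gib
        self.lazy_offload = lazy_offload
        self.resident: dict[str, float] = {}  # model name -> VRAM in GiB

    def run(self, name: str, size_gib: float) -> None:
        # Evict idle models (oldest first) until the new one fits.
        while self.resident and sum(self.resident.values()) + size_gib > self.capacity:
            del self.resident[next(iter(self.resident))]
        self.resident[name] = size_gib
        if not self.lazy_offload:
            # Eager mode: release VRAM as soon as the model finishes.
            del self.resident[name]

cache = LazyModelCache(capacity_gib=8.0, lazy_offload=True)
cache.run("sd_unet", 5.0)
cache.run("esrgan_x4", 4.0)    # sd_unet is evicted to make room
print(sorted(cache.resident))  # ['esrgan_x4']
```

The OOM in this issue corresponds to the pre-#6132 behavior where the upscaler's allocation was neither tracked nor evictable, so nothing made room for it.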
