While running a Stable Diffusion model through InvokeAI (a GUI similar to Automatic1111), I receive the following PyTorch/CUDA error at seemingly random points during GPU-intensive operations, such as image generation or loading another model:
>> Model change requested: stable-diffusion-1.5
>> Current VRAM usage: 0.00G
>> Offloading custom-elysium-anime-v2 to CPU
>> Scanning Model: stable-diffusion-1.5
>> Model Scanned. OK!!
>> Loading stable-diffusion-1.5 from G:\invokeAI\bullshit\models\ldm\stable-diffusion-v1\v1-5-pruned-emaonly.ckpt
** model stable-diffusion-1.5 could not be loaded: [enforce fail at C:\cb\pytorch_1000000000000\work\c10\core\impl\alloc_cpu.cpp:81] data. DefaultCPUAllocator: not enough memory: you tried to allocate 13107200 bytes.
Traceback (most recent call last):
File "g:\invokeai\ldm\invoke\model_cache.py", line 80, in get_model
requested_model, width, height, hash = self._load_model(model_name)
File "g:\invokeai\ldm\invoke\model_cache.py", line 228, in _load_model
sd = torch.load(io.BytesIO(weight_bytes), map_location='cpu')
File "G:\invokeAI\installer_files\env\envs\invokeai\lib\site-packages\torch\serialization.py", line 712, in load
return _load(opened_zipfile, map_location, pickle_module, **pickle_load_args)
File "G:\invokeAI\installer_files\env\envs\invokeai\lib\site-packages\torch\serialization.py", line 1049, in _load
result = unpickler.load()
File "G:\invokeAI\installer_files\env\envs\invokeai\lib\site-packages\torch\serialization.py", line 1019, in persistent_load
load_tensor(dtype, nbytes, key, _maybe_decode_ascii(location))
File "G:\invokeAI\installer_files\env\envs\invokeai\lib\site-packages\torch\serialization.py", line 997, in load_tensor
storage = zip_file.get_storage_from_record(name, numel, torch._UntypedStorage).storage()._untyped()
RuntimeError: [enforce fail at C:\cb\pytorch_1000000000000\work\c10\core\impl\alloc_cpu.cpp:81] data. DefaultCPUAllocator: not enough memory: you tried to allocate 13107200 bytes.
** restoring custom-elysium-anime-v2
>> Retrieving model custom-elysium-anime-v2 from system RAM cache
Traceback (most recent call last):
File "g:\invokeai\ldm\invoke\model_cache.py", line 80, in get_model
requested_model, width, height, hash = self._load_model(model_name)
File "g:\invokeai\ldm\invoke\model_cache.py", line 228, in _load_model
sd = torch.load(io.BytesIO(weight_bytes), map_location='cpu')
File "G:\invokeAI\installer_files\env\envs\invokeai\lib\site-packages\torch\serialization.py", line 712, in load
return _load(opened_zipfile, map_location, pickle_module, **pickle_load_args)
File "G:\invokeAI\installer_files\env\envs\invokeai\lib\site-packages\torch\serialization.py", line 1049, in _load
result = unpickler.load()
File "G:\invokeAI\installer_files\env\envs\invokeai\lib\site-packages\torch\serialization.py", line 1019, in persistent_load
load_tensor(dtype, nbytes, key, _maybe_decode_ascii(location))
File "G:\invokeAI\installer_files\env\envs\invokeai\lib\site-packages\torch\serialization.py", line 997, in load_tensor
storage = zip_file.get_storage_from_record(name, numel, torch._UntypedStorage).storage()._untyped()
RuntimeError: [enforce fail at C:\cb\pytorch_1000000000000\work\c10\core\impl\alloc_cpu.cpp:81] data. DefaultCPUAllocator: not enough memory: you tried to allocate 13107200 bytes.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "g:\invokeai\backend\invoke_ai_web_server.py", line 301, in handle_set_model
model = self.generate.set_model(model_name)
File "g:\invokeai\ldm\generate.py", line 843, in set_model
model_data = cache.get_model(model_name)
File "g:\invokeai\ldm\invoke\model_cache.py", line 93, in get_model
self.get_model(self.current_model)
File "g:\invokeai\ldm\invoke\model_cache.py", line 73, in get_model
self.models[model_name]['model'] = self._model_from_cpu(requested_model)
File "g:\invokeai\ldm\invoke\model_cache.py", line 371, in _model_from_cpu
model.to(self.device)
File "G:\invokeAI\installer_files\env\envs\invokeai\lib\site-packages\pytorch_lightning\core\mixins\device_dtype_mixin.py", line 113, in to
return super().to(*args, **kwargs)
File "G:\invokeAI\installer_files\env\envs\invokeai\lib\site-packages\torch\nn\modules\module.py", line 927, in to
return self._apply(convert)
File "G:\invokeAI\installer_files\env\envs\invokeai\lib\site-packages\torch\nn\modules\module.py", line 579, in _apply
module._apply(fn)
File "G:\invokeAI\installer_files\env\envs\invokeai\lib\site-packages\torch\nn\modules\module.py", line 579, in _apply
module._apply(fn)
File "G:\invokeAI\installer_files\env\envs\invokeai\lib\site-packages\torch\nn\modules\module.py", line 579, in _apply
module._apply(fn)
[Previous line repeated 1 more time]
File "G:\invokeAI\installer_files\env\envs\invokeai\lib\site-packages\torch\nn\modules\module.py", line 602, in _apply
param_applied = fn(param)
File "G:\invokeAI\installer_files\env\envs\invokeai\lib\site-packages\torch\nn\modules\module.py", line 925, in convert
return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking)
RuntimeError: CUDA out of memory. Tried to allocate 20.00 MiB (GPU 0; 8.00 GiB total capacity; 802.50 KiB already allocated; 6.10 GiB free; 2.00 MiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
The line that catches my attention is:
RuntimeError: CUDA out of memory. Tried to allocate 20.00 MiB (GPU 0; 8.00 GiB total capacity; 802.50 KiB already allocated; 6.10 GiB free; 2.00 MiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
Normally I'd put this down to simply running out of VRAM, as the message suggests. However, it is failing to allocate only 20 MiB while reporting 6.10 GiB free, which is why I don't understand why the error appears.
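For reference, this is the kind of check I would run from the same conda environment to see what the driver and the PyTorch caching allocator report. It's just a sketch, not something InvokeAI exposes, and it assumes CUDA device 0 (the RTX 2080):

import torch

# What the driver reports for device 0: (free, total) in bytes.
free_bytes, total_bytes = torch.cuda.mem_get_info(0)
print(f"free:  {free_bytes / 1024**3:.2f} GiB")
print(f"total: {total_bytes / 1024**3:.2f} GiB")

# Breakdown of what PyTorch's own caching allocator is holding on device 0.
print(torch.cuda.memory_summary(device=0))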
This is running on an RTX 2080 with driver version 527.56.
I've been unable to find any solutions that apply to my case. I'd expect the allocation to succeed, since far more memory is free than is being requested, yet the error occurs anyway.
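The only lead I have is the max_split_size_mb hint in the message itself. As I understand it, that is configured through the PYTORCH_CUDA_ALLOC_CONF environment variable before torch initializes its CUDA allocator; the sketch below is what I'd try, though the 128 MiB value is an arbitrary guess and I'm not sure fragmentation is even plausible here given how little memory is reported as allocated:

import os

# Must be set before the CUDA caching allocator is initialized,
# i.e. before the first CUDA allocation; 128 MiB is an arbitrary guess.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"

import torch  # imported after setting the variable, to be safe

x = torch.zeros(1, device="cuda")  # first allocation picks up the config

Equivalently, the variable could be set in the Windows environment before launching InvokeAI, but I don't know whether this addresses the actual cause.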