While running a Stable Diffusion model through InvokeAI (a GUI similar to Automatic1111), I receive the following PyTorch/CUDA error at seemingly random points during GPU-intensive operations, such as image generation or loading another model:
>> Model change requested: stable-diffusion-1.5
>> Current VRAM usage: 0.00G
>> Offloading custom-elysium-anime-v2 to CPU
>> Scanning Model: stable-diffusion-1.5
>> Model Scanned. OK!!
>> Loading stable-diffusion-1.5 from G:\invokeAI\bullshit\models\ldm\stable-diffusion-v1\v1-5-pruned-emaonly.ckpt
** model stable-diffusion-1.5 could not be loaded: [enforce fail at C:\cb\pytorch_1000000000000\work\c10\core\impl\alloc_cpu.cpp:81] data. DefaultCPUAllocator: not enough memory: you tried to allocate 13107200 bytes.
Traceback (most recent call last):
File "g:\invokeai\ldm\invoke\model_cache.py", line 80, in get_model
requested_model, width, height, hash = self._load_model(model_name)
File "g:\invokeai\ldm\invoke\model_cache.py", line 228, in _load_model
sd = torch.load(io.BytesIO(weight_bytes), map_location='cpu')
File "G:\invokeAI\installer_files\env\envs\invokeai\lib\site-packages\torch\serialization.py", line 712, in load
return _load(opened_zipfile, map_location, pickle_module, **pickle_load_args)
File "G:\invokeAI\installer_files\env\envs\invokeai\lib\site-packages\torch\serialization.py", line 1049, in _load
result = unpickler.load()
File "G:\invokeAI\installer_files\env\envs\invokeai\lib\site-packages\torch\serialization.py", line 1019, in persistent_load
load_tensor(dtype, nbytes, key, _maybe_decode_ascii(location))
File "G:\invokeAI\installer_files\env\envs\invokeai\lib\site-packages\torch\serialization.py", line 997, in load_tensor
storage = zip_file.get_storage_from_record(name, numel, torch._UntypedStorage).storage()._untyped()
RuntimeError: [enforce fail at C:\cb\pytorch_1000000000000\work\c10\core\impl\alloc_cpu.cpp:81] data. DefaultCPUAllocator: not enough memory: you tried to allocate 13107200 bytes.
** restoring custom-elysium-anime-v2
>> Retrieving model custom-elysium-anime-v2 from system RAM cache
Traceback (most recent call last):
File "g:\invokeai\ldm\invoke\model_cache.py", line 80, in get_model
requested_model, width, height, hash = self._load_model(model_name)
File "g:\invokeai\ldm\invoke\model_cache.py", line 228, in _load_model
sd = torch.load(io.BytesIO(weight_bytes), map_location='cpu')
File "G:\invokeAI\installer_files\env\envs\invokeai\lib\site-packages\torch\serialization.py", line 712, in load
return _load(opened_zipfile, map_location, pickle_module, **pickle_load_args)
File "G:\invokeAI\installer_files\env\envs\invokeai\lib\site-packages\torch\serialization.py", line 1049, in _load
result = unpickler.load()
File "G:\invokeAI\installer_files\env\envs\invokeai\lib\site-packages\torch\serialization.py", line 1019, in persistent_load
load_tensor(dtype, nbytes, key, _maybe_decode_ascii(location))
File "G:\invokeAI\installer_files\env\envs\invokeai\lib\site-packages\torch\serialization.py", line 997, in load_tensor
storage = zip_file.get_storage_from_record(name, numel, torch._UntypedStorage).storage()._untyped()
RuntimeError: [enforce fail at C:\cb\pytorch_1000000000000\work\c10\core\impl\alloc_cpu.cpp:81] data. DefaultCPUAllocator: not enough memory: you tried to allocate 13107200 bytes.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "g:\invokeai\backend\invoke_ai_web_server.py", line 301, in handle_set_model
model = self.generate.set_model(model_name)
File "g:\invokeai\ldm\generate.py", line 843, in set_model
model_data = cache.get_model(model_name)
File "g:\invokeai\ldm\invoke\model_cache.py", line 93, in get_model
self.get_model(self.current_model)
File "g:\invokeai\ldm\invoke\model_cache.py", line 73, in get_model
self.models[model_name]['model'] = self._model_from_cpu(requested_model)
File "g:\invokeai\ldm\invoke\model_cache.py", line 371, in _model_from_cpu
model.to(self.device)
File "G:\invokeAI\installer_files\env\envs\invokeai\lib\site-packages\pytorch_lightning\core\mixins\device_dtype_mixin.py", line 113, in to
return super().to(*args, **kwargs)
File "G:\invokeAI\installer_files\env\envs\invokeai\lib\site-packages\torch\nn\modules\module.py", line 927, in to
return self._apply(convert)
File "G:\invokeAI\installer_files\env\envs\invokeai\lib\site-packages\torch\nn\modules\module.py", line 579, in _apply
module._apply(fn)
File "G:\invokeAI\installer_files\env\envs\invokeai\lib\site-packages\torch\nn\modules\module.py", line 579, in _apply
module._apply(fn)
File "G:\invokeAI\installer_files\env\envs\invokeai\lib\site-packages\torch\nn\modules\module.py", line 579, in _apply
module._apply(fn)
[Previous line repeated 1 more time]
File "G:\invokeAI\installer_files\env\envs\invokeai\lib\site-packages\torch\nn\modules\module.py", line 602, in _apply
param_applied = fn(param)
File "G:\invokeAI\installer_files\env\envs\invokeai\lib\site-packages\torch\nn\modules\module.py", line 925, in convert
return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking)
RuntimeError: CUDA out of memory. Tried to allocate 20.00 MiB (GPU 0; 8.00 GiB total capacity; 802.50 KiB already allocated; 6.10 GiB free; 2.00 MiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
The line that catches my attention is:
RuntimeError: CUDA out of memory. Tried to allocate 20.00 MiB (GPU 0; 8.00 GiB total capacity; 802.50 KiB already allocated; 6.10 GiB free; 2.00 MiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
Normally I'd put this down to simply running out of VRAM, as the message suggests. However, it is failing to allocate only 20 MiB while reporting 6.10 GiB free, which is why I don't understand why the error appears.
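For reference, this is the kind of check I would run from the same conda environment to see what the driver and the PyTorch caching allocator report. It's just a sketch, not something InvokeAI exposes, and it assumes CUDA device 0 (the RTX 2080):

import torch

# What the driver reports for device 0: (free, total) in bytes.
free_bytes, total_bytes = torch.cuda.mem_get_info(0)
print(f"free:  {free_bytes / 1024**3:.2f} GiB")
print(f"total: {total_bytes / 1024**3:.2f} GiB")

# Breakdown of what PyTorch's own caching allocator is holding on device 0.
print(torch.cuda.memory_summary(device=0))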
This is running on an RTX 2080 with driver version 527.56.
I've been unable to find any solutions that apply to my case. I'd expect the allocation to succeed, since far more memory is free than is being requested, yet the error occurs anyway.
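The only lead I have is the max_split_size_mb hint in the message itself. As I understand it, that is configured through the PYTORCH_CUDA_ALLOC_CONF environment variable before torch initializes its CUDA allocator; the sketch below is what I'd try, though the 128 MiB value is an arbitrary guess and I'm not sure fragmentation is even plausible here given how little memory is reported as allocated:

import os

# Must be set before the CUDA caching allocator is initialized,
# i.e. before the first CUDA allocation; 128 MiB is an arbitrary guess.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"

import torch  # imported after setting the variable, to be safe

x = torch.zeros(1, device="cuda")  # first allocation picks up the config

Equivalently, the variable could be set in the Windows environment before launching InvokeAI, but I don't know whether this addresses the actual cause.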