It probably doesn't. It also depends on what you call a memory leak. In this case, all memory should be freed after the program ends; Python has a garbage collector, so the freeing might not happen immediately (at your `del`, or after leaving the scope) the way it does in C++ or similar languages with RAII.
`del`

- `del` is called by Python and only removes the reference (same as when the object goes out of scope in your function).
- `torch.nn.Module` does not implement `__del__`, hence its reference is simply removed.
- All of the elements within `torch.nn.Module` have their references removed recursively (so for each CUDA `torch.Tensor` instance its `__del__` is called).
- `__del__` on each tensor is a call to release its memory (see the sketch below).
- More about `__del__`.
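A minimal sketch of the above, assuming a CUDA-capable GPU is available (the layer size and prints are illustrative only): `del` drops the Python reference, the tensors' memory returns to PyTorch's caching allocator once nothing references them, but it is not necessarily handed back to the driver.

```python
import gc
import torch

def demo():
    model = torch.nn.Linear(4096, 4096).cuda()          # parameters allocated on the GPU
    print("allocated:", torch.cuda.memory_allocated())  # > 0 while `model` is referenced
    del model                                           # only removes the reference
    gc.collect()                                        # make collection deterministic for the demo
    print("allocated:", torch.cuda.memory_allocated())  # back to ~0: the tensors' __del__ ran
    print("reserved: ", torch.cuda.memory_reserved())   # still > 0: kept by the caching allocator

if __name__ == "__main__":
    demo()
```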
Caching allocator
Another thing: the caching allocator holds on to part of the memory so that, when you are about to use it, it doesn't have to compete with other apps for CUDA memory.
Also, I assume PyTorch is loaded lazily, hence you get 0 MB used at the very beginning, but AFAIK PyTorch itself, during startup, reserves some part of CUDA memory.
The short story is given here, and a longer one here, in case you didn't see it already.
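A rough illustration of the allocated-vs-reserved distinction, assuming a CUDA device (the tensor size is arbitrary): memory freed by `del` stays reserved in PyTorch's cache, and `torch.cuda.empty_cache()` hands the unused cached blocks back to the driver.

```python
import torch

x = torch.empty(1024, 1024, 256, device="cuda")  # ~1 GiB of float32
del x
print(torch.cuda.memory_allocated())  # ~0: no live tensors
print(torch.cuda.memory_reserved())   # ~1 GiB: still cached by the allocator
torch.cuda.empty_cache()              # release unused cached blocks to the driver
print(torch.cuda.memory_reserved())   # noticeably smaller now
```

Note that `nvidia-smi` will still show some usage even after `empty_cache()`, since the CUDA context created at startup stays around for the lifetime of the process.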
Possible experiments
- You may try to run `time.sleep(5)` after your function and measure afterwards.
- You can get a snapshot of the allocator state via `torch.cuda.memory_snapshot` to get more info about the allocator's reserved memory and inner workings.
- You might set the environment variable `PYTORCH_NO_CUDA_MEMORY_CACHING=1` and see whether anything changes; a rough sketch combining these experiments follows below.
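A hedged sketch of those experiments, assuming a CUDA device (the tensor size and the 5-second sleep are arbitrary). `PYTORCH_NO_CUDA_MEMORY_CACHING` needs to be set before the first CUDA allocation, so it is easiest to export it before launching the script.

```python
import os
import time

# Experiment 3: bypass the caching allocator entirely.
# Easiest done by exporting the variable before launching the script;
# setting it here before importing torch should also work.
# os.environ["PYTORCH_NO_CUDA_MEMORY_CACHING"] = "1"

import torch

def experiment():
    x = torch.randn(1 << 20, device="cuda")   # some throwaway CUDA work
    del x
    time.sleep(5)                             # Experiment 1: wait before measuring
    print(torch.cuda.memory_allocated(), torch.cuda.memory_reserved())
    segments = torch.cuda.memory_snapshot()   # Experiment 2: raw allocator state
    print(len(segments), "segments reserved by the caching allocator")

if __name__ == "__main__":
    experiment()
```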
Disclaimer
I'm not a CUDA expert by any means, so someone with more insight could probably expand on (and/or correct) my current understanding, as I'm sure many more things happen under the hood.