
In PyTorch 1.0.0, I found that a tensor variable occupies very little memory. I wonder how it can store so much data. Here's the code.

import sys
import numpy as np
import torch

a = np.random.randn(1, 1, 128, 256)
b = torch.tensor(a, device=torch.device('cpu'))

a_size = sys.getsizeof(a)
b_size = sys.getsizeof(b)

a_size is 262288. b_size is 72.

feedMe
laridzhang

2 Answers


The answer is in two parts. From the documentation of sys.getsizeof, firstly

All built-in objects will return correct results, but this does not have to hold true for third-party extensions as it is implementation specific.

so it could be that for tensors __sizeof__ is undefined or defined differently than you would expect; this function is not something you can rely on. Secondly

Only the memory consumption directly attributed to the object is accounted for, not the memory consumption of objects it refers to.

which means that if the torch.Tensor object merely holds a reference to the actual memory, this won't show in sys.getsizeof. This is indeed the case: if you check the size of the underlying storage instead, you will see the expected number

import torch, sys
b = torch.randn(1, 1, 128, 256, dtype=torch.float64)
sys.getsizeof(b)
>> 72
sys.getsizeof(b.storage())
>> 262208

Note: I am setting dtype to float64 explicitly, because that is the default dtype in numpy, whereas torch uses float32 by default.
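As a cross-check (not part of the original answer), the payload size can also be computed directly from the tensor's shape and element size; the small gaps between this figure and the numbers above (262208, 262288) are per-object Python/NumPy header overhead, which varies by version:

```python
import torch

b = torch.randn(1, 1, 128, 256, dtype=torch.float64)

# elements * bytes-per-element gives the raw buffer size
payload = b.nelement() * b.element_size()
payload  # 262144 = 1 * 1 * 128 * 256 * 8 bytes
```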

Jatentaki
    Really great answer, would give another +1 for the pointer towards `storage`. Super helpful to know! – dennlinger Jan 25 '19 at 12:18
  • Thanks, this worked! For the memory of a slice of a tensor, `storage()` will return the memory size of the whole tensor. Thus, we need to clone/copy the slice first. – Chau Pham Jul 16 '22 at 21:31
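Following up on the comment about slices, a minimal sketch (variable names are illustrative) showing that a view reports the whole underlying buffer, while a clone owns only its own elements (on recent PyTorch versions `storage()` may emit a deprecation warning in favor of `untyped_storage()`):

```python
import torch

t = torch.randn(128, 256, dtype=torch.float64)
s = t[:4]       # a view: shares t's underlying buffer
c = s.clone()   # a copy: owns a fresh, smaller buffer

len(t.storage())  # 32768 elements (128 * 256)
len(s.storage())  # 32768 as well -- the view sees the whole buffer
len(c.storage())  # 1024 elements (4 * 256)
```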

If you want to get the size of a tensor or network on CUDA, you can use this code to calculate its size:

import torch

device = 'cuda:0'

# before
torch._C._cuda_clearCublasWorkspaces()
memory_before = torch.cuda.memory_allocated(device)

# your tensor or network
data5 = torch.randn((10000, 100), device=device)

# after
memory_after = torch.cuda.memory_allocated(device)
latent_size = memory_after - memory_before

latent_size
# 4000256
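A note on why the result is 4000256 rather than the raw 10000 * 100 * 4 = 4000000 bytes: PyTorch's CUDA caching allocator rounds each allocation up to a multiple of 512 bytes. The arithmetic can be checked in plain Python, no GPU needed:

```python
# float32 payload of a (10000, 100) tensor
raw = 10000 * 100 * 4            # 4000000 bytes
# the caching allocator rounds allocations up to 512-byte blocks
rounded = -(-raw // 512) * 512   # ceiling division, then back to bytes
rounded  # 4000256
```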

Got the idea from this: https://github.com/pytorch/pytorch/blob/ee28b865ee9c87cce4db0011987baf8d125cc857/torch/distributed/pipeline/sync/_balance/profile.py#L102

yuanzz