Suppose I have an algorithm that does the following in Python with PyTorch. Please ignore whether the steps are efficient; this is just a silly toy example.
import torch
import torch.nn.functional as F

def foo(input_list):
    # input_list is a list of N 2-D pytorch tensors of shape (h, w)
    tensor = torch.stack(input_list, dim=-1)           # stack to shape (h, w, N)
    tensor1 = tensor.permute(2, 0, 1).unsqueeze(1)     # rearrange to shape (N, 1, h, w)
    tensor2 = F.interpolate(tensor1, size=(500, 500))  # upsample to shape (N, 1, 500, 500)
    return tensor2
def bar(input_list):
    tensor = torch.stack(input_list, dim=-1)           # stack to shape (h, w, N)
    tensor = tensor.permute(2, 0, 1).unsqueeze(1)      # rearrange to shape (N, 1, h, w)
    tensor = F.interpolate(tensor, size=(500, 500))    # upsample to shape (N, 1, 500, 500)
    return tensor
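For concreteness, both functions could be called like this (the values N=8 and (h, w) = (64, 64) are arbitrary placeholders I picked for illustration):

input_list = [torch.randn(64, 64) for _ in range(8)]  # N=8 tensors of shape (h, w)
out = foo(input_list)
print(out.shape)  # torch.Size([8, 1, 500, 500])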
My question is whether it makes more sense to use foo() or bar(), or whether it doesn't matter. My thought was that bar() saves memory by overwriting the same variable name, since I never actually need those intermediate results. But if the CUDA interface allocates a new memory block for each operation regardless, then I'm spending the same amount of memory with both methods.
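In case it helps, here is a minimal sketch of how I was planning to compare the two, assuming a CUDA device is available (the tensor sizes are again arbitrary placeholder values):

input_list = [torch.randn(64, 64, device="cuda") for _ in range(8)]

for fn in (foo, bar):
    torch.cuda.synchronize()
    torch.cuda.reset_peak_memory_stats()  # clear the peak-memory counter
    out = fn(input_list)
    torch.cuda.synchronize()
    peak_mib = torch.cuda.max_memory_allocated() / 1024**2
    print(f"{fn.__name__}: peak allocated = {peak_mib:.2f} MiB")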