How can one trace the memory allocated for the autograd graph that the forward pass creates on the CPU? For instance, I tried using tracemalloc on the CPU:

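import tracemalloc
import torch
import torch.nn as nn

rnn = nn.RNNCell(100, 100).to('cpu')
x = torch.ones((1000, 100), device='cpu')
tracemalloc.start(25)  # keep up to 25 frames per traceback
while True:
    print(tracemalloc.get_traced_memory())  # (current, peak) in bytes
    x = rnn(x)  # each step extends the autograd graph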

I expected the printed memory to increase continually, since the graph grows with each loop iteration, but the value printed by

tracemalloc.get_traced_memory()

remains constant after the third iteration. What is going on?
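
A possible explanation, offered as a hedged note rather than a confirmed answer: tracemalloc only records memory requested through Python's own allocators, while PyTorch allocates tensor storage in native code, so growth in tensor buffers would likely be invisible to it. The sketch below illustrates that blind spot using only the standard library. It does not use PyTorch: an anonymous `mmap` stands in for a natively allocated buffer, and the helper `traced_growth` and the constant `SIZE` are hypothetical names introduced for this illustration.

```python
import mmap
import tracemalloc


def traced_growth(allocate, size):
    """Return how many bytes tracemalloc attributes to calling allocate(size)."""
    tracemalloc.start()
    before, _ = tracemalloc.get_traced_memory()
    keep = allocate(size)  # hold a reference so the memory stays alive
    after, _ = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    del keep
    return after - before


SIZE = 50 * 1024 * 1024  # 50 MiB

# bytearray storage comes from Python's allocator, so tracemalloc sees it.
python_growth = traced_growth(bytearray, SIZE)

# An anonymous mmap is obtained directly from the OS (a stand-in for a
# native allocator), so tracemalloc only sees the small wrapper object.
native_growth = traced_growth(lambda n: mmap.mmap(-1, n), SIZE)

print(python_growth, native_growth)
```

If the same thing is happening in the loop above, process-level measurements (for example the resident set size reported by the operating system) would be a more suitable signal than tracemalloc for watching the graph grow.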
