
I am following the 60-minute blitz on PyTorch, but I have a question about converting a NumPy array to a tensor. Tutorial example here.

This piece of code:

import numpy as np
import torch
a = np.ones(5)
b = torch.from_numpy(a)
np.add(a, 1, out=a)
print(a)
print(b)

yields

[2. 2. 2. 2. 2.]

tensor([2., 2., 2., 2., 2.], dtype=torch.float64)

However

import numpy as np
import torch
a = np.ones(5)
b = torch.from_numpy(a)
a = a + 1  # the difference is here
print(a)
print(b)

yields

[2. 2. 2. 2. 2.]

tensor([1., 1., 1., 1., 1.], dtype=torch.float64)

Why are the outputs different?

– syltruong

2 Answers


This actually has little to do with PyTorch. Compare

import numpy as np
a = np.ones(5)
b = a

followed by either

np.add(a, 1, out=a)
print(b)

or

a = a + 1
print(b)

There is a difference between np.add(a, 1, out=a) and a = a + 1. In the former you retain the same object (array) a with different values (2 instead of 1); in the latter you get a new array, which is bound to the same variable name a and has values of 2. However, the "original" a is discarded, and unless something else (here, b) points to it, it will be deallocated. In other words, the first operation is in-place and the second is out-of-place. Since b holds on to the array originally found at a, reassigning a + 1 to a does not affect the value of b. An alternative in-place mutation syntax would be

a[:] = a + 1
print(b)
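
To make the binding behavior concrete, here is a minimal sketch (pure NumPy, no PyTorch involved) that compares object identities with id():

import numpy as np

a = np.ones(5)
b = a                    # b is just another name for the same array object

np.add(a, 1, out=a)      # in-place: the object stays the same, values change
print(id(a) == id(b))    # True: a and b still name one object

a = a + 1                # out-of-place: a is rebound to a freshly allocated array
print(id(a) == id(b))    # False: a now names a different object
print(b)                 # [2. 2. 2. 2. 2.] -- b kept the original array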

Regarding PyTorch, it's very simple: from_numpy creates a tensor which aliases the actual array object, so it is equivalent to the b = a line in my first snippet. The tensor will track in-place changes to the array that a named at the point of calling, not changes to what the name a later points to.
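
A quick sketch making the aliasing visible (assuming torch is installed):

import numpy as np
import torch

a = np.ones(5)
b = torch.from_numpy(a)  # b aliases the buffer behind the array a names right now

a += 1                   # in-place addition mutates the shared buffer
print(b)                 # tensor([2., 2., 2., 2., 2.], dtype=torch.float64)

a = np.zeros(5)          # rebinding: the name a moves to a brand-new array
print(b)                 # unchanged -- b still aliases the original buffer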

– Jatentaki
  • b = a, followed by id(a), id(b), shows that they are in fact sharing the same memory location. However, in the snippet above (b = a.numpy()), if you do id(a) and id(b) it gives different values. How are they sharing the memory location when id gives different values? – Ambareesh Apr 28 '19 at 23:37
  • @Ambareesh `a.numpy()` creates an array which shares the contiguous buffer holding the tensor/array elements, but the two have different "Python wrappers" which keep the strides etc. In other words, `a.numpy()` and `a` are different objects which both contain a pointer to the same memory region. For the PyTorch side of things, you can refer to my answer to [why tensors appear so small](https://stackoverflow.com/questions/54361763/pytorch-why-is-the-memory-occupied-by-the-tensor-variable-so-small/54365012#54365012). – Jatentaki May 04 '19 at 13:09
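
To illustrate the comment above with a small check: id() compares the wrapper objects, while the data pointers expose the shared buffer (data_ptr and __array_interface__ are standard torch/NumPy APIs).

import numpy as np
import torch

t = torch.ones(5)
arr = t.numpy()           # a new array wrapper over the tensor's buffer

print(id(t) == id(arr))   # False: two distinct Python objects
# Both report the same address for the underlying data:
print(t.data_ptr() == arr.__array_interface__['data'][0])  # True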

My take on this:

When you do

b = torch.from_numpy(a)

b and a point to the same place in memory. From the docs:

Converting a torch Tensor to a numpy array and vice versa is a breeze. The torch Tensor and numpy array will share their underlying memory locations, and changing one will change the other.

When you do

np.add(a, 1, out=a)

you modify a in place, while when you do

a = a + 1

you create a new array (also named a); that new array doesn't share its underlying memory location with b, so b is not affected.
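
One way to verify this, sketched with np.shares_memory (the tensor is viewed back as an array first, since from_numpy and .numpy() both share the buffer):

import numpy as np
import torch

a = np.ones(5)
b = torch.from_numpy(a)

print(np.shares_memory(a, b.numpy()))  # True: one buffer behind both

a = a + 1                              # a is rebound to a new array
print(np.shares_memory(a, b.numpy()))  # False: b still wraps the old buffer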

– Statistic Dean