Background

Note that np.copy is a shallow copy and will not copy object elements within arrays. This is mainly important for arrays containing Python objects. The new array will contain the same object which may lead to surprises if that object can be modified (is mutable):

Based on 3.2 Memory layout, I supposed that array a points to a memory region that contains the 64-bit float numbers.

Said differently, an array is mostly a contiguous block of memory whose parts can be accessed using an indexing scheme.

Because the documentation says np.copy is a shallow copy and will not copy object elements within arrays, I assumed the memory region would not be copied but referenced.

However, the copy is not a view of the original array, and the memory regions/addresses are different, because x[0] == z[0] is False after x[0] = 10.

import numpy as np

x = np.array([1, 2, 3])
z = np.copy(x)
print(f"x.base {x.base} z.base {z.base}")
---
x.base None z.base None

x[0] = 10
print(x[0] == z[0])
---
False

print(f"x.base {x.base} z.base {z.base}")
---
x.base None z.base None

Questions

What is np.copy(a) actually copying?

If it is a shallow copy, why is a different memory area allocated for z[0]? Is it actually copying the elements of the array x, which should be called a deep copy? Or is it copy-on-write?

Or does shallow copy mean "when an element of a NumPy array is a mutable container such as a list, np.copy will not deep-copy the contents of that container"?

Please help me understand what exactly z = np.copy(x) does and what happens when assignment operations are applied to z.

mon
    You don't have an `object` dtype array, so don't worry about `deepcopy` – hpaulj Mar 16 '21 at 01:52
    If the dtype is object, the array's databuffer contains references to objects elsewhere in memory. A `copy` will have a new databuffer, but the values will reference the same objects. That's the same problem as with a shallow copy of a nested list. But with a numeric dtype, having a new databuffer is enough. The new buffer has new values. That's the regular array copy and doesn't require over thinking. – hpaulj Mar 16 '21 at 02:14
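The object-dtype case described in the comment above can be sketched as follows. This is a minimal illustration, not part of the original thread; the array contents are made up for the example, and copy.deepcopy is used only as a contrast to np.copy.

```python
import copy
import numpy as np

# Object-dtype array: the data buffer holds references to Python objects.
a = np.array([1, "m", [2, 3, 4]], dtype=object)

b = np.copy(a)        # new data buffer, but it references the same objects
c = copy.deepcopy(a)  # new data buffer and new copies of the objects

a[2][0] = 10          # mutate the shared list in place

print(b[2])           # [10, 3, 4]  (b sees the change)
print(c[2])           # [2, 3, 4]   (c is unaffected)
print(b[2] is a[2])   # True
```

Replacing an element (for example a[0] = 99) would not affect b, since each array has its own buffer of references; only in-place mutation of a shared object is visible through both.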

1 Answer

In Python, copy.copy() performs a shallow copy: it creates a new top-level object, but the objects it contains (e.g., nested lists or dictionaries) are not copied; the new container simply references the same objects through pointers. For immutable values such as int or float this sharing is harmless, so they behave as if they had been copied.

On the other hand, the copy.deepcopy() method will copy everything (including composite data structures such as dictionaries), and will result in a new object identical to the first.
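The difference between the two functions can be seen in a small sketch using only the standard library. The dictionary and its keys are hypothetical, chosen just for illustration.

```python
import copy

original = {"name": "x", "values": [1, 2, 3]}

shallow = copy.copy(original)     # new dict, same nested list object
deep = copy.deepcopy(original)    # new dict and a new nested list

original["values"].append(4)      # mutate the nested list in place
original["name"] = "y"            # rebind a top-level key

print(shallow["values"])  # [1, 2, 3, 4]  (nested list is shared)
print(deep["values"])     # [1, 2, 3]     (nested list was copied)
print(shallow["name"])    # x             (top-level rebinding is not shared)
```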

In the case of NumPy, if you use it for numerical computation as you should (i.e., with arrays of one of the numeric dtypes), copy is effectively equivalent to deepcopy: the elements are stored by value in the array's data buffer rather than as references, so the result is a new, independent object identical to the first. That is what you see as a new data space being allocated for the copied array.
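For a numeric array, one way to check this is to compare a copy with a view. The sketch below reuses x and z from the question; the view v is added here only for contrast.

```python
import numpy as np

x = np.array([1, 2, 3])
z = np.copy(x)   # independent data buffer
v = x.view()     # shares x's data buffer

x[0] = 10

print(np.shares_memory(x, z))       # False
print(np.shares_memory(x, v))       # True
print(z[0], v[0])                   # 1 10
print(z.base is None, v.base is x)  # True True
```

So there is no copy-on-write involved: the new buffer is allocated and filled when np.copy is called, and later assignments to either array touch only that array's own buffer.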

Cheers.

Michael