numpy recarray copy retains dtype reference?

Question

I am trying to copy a recarray and change the names of the fields/records in the new array. However, this also changes the field names of the original array (the values themselves are not linked). Example:

import numpy as np
import copy

# define original array

arr = np.array(np.random.random((3,2)),
               dtype=[('a','float'),('b','float')])

# first copy

arr2 = arr.copy()
arr2.dtype.names = ('c','d')
arr.dtype.names
--> ('c','d')

# second copy

arr3 = copy.deepcopy(arr2)
arr2.dtype.names = ('e','f')
arr.dtype.names
--> ('e','f')

Why does this happen, and how can I keep it from happening? I suspect the dtype is a separate object whose reference is copied by copy(), but even if I assign a deep copy of the dtype object to the original array, I get the same result:

dt = copy.deepcopy(arr.dtype)
arr.dtype = dt

arr3.dtype.names = ('g','h')
arr.dtype.names
--> ('g','h')

score 1 · Accepted Answer · answered Oct 29 '11 at 00:17

I interpret your question to mean that you want arr3 to have its own dtype, so that you can modify it without affecting the dtype of the original array. If so, you can do this:

arr.dtype 
# --> dtype([('a', '<f8'), ('b', '<f8')])
dt3 = copy.deepcopy(arr.dtype)
dt3.names = ('g','h')
arr3 = np.array(arr, dtype=dt3)
arr.dtype 
# --> dtype([('a', '<f8'), ('b', '<f8')])

The trick seems to be that arr3 has to be created with a different dtype (change dt3 first, then create arr3). Otherwise, ndarray grabs the pre-existing dtype (it seems to act as some kind of proxy).
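A variation on the same idea, as a minimal sketch rather than a definitive recipe: build a brand-new dtype with the new names and never assign to dtype.names at all, so it does not matter whether the original dtype object is shared. The helper name copy_with_names is hypothetical (it is not a NumPy function), and the sketch assumes a packed structured dtype with no custom offsets or padding.

```python
import numpy as np

def copy_with_names(arr, names):
    """Hypothetical helper: independent copy of a structured array with renamed fields."""
    old = arr.dtype
    if old.names is None or len(names) != len(old.names):
        raise ValueError("need exactly one new name per existing field")
    # Rebuild the dtype from (new name, field dtype) pairs instead of
    # mutating dtype.names on an object the original array may share.
    new_dtype = np.dtype([(new, old[name]) for new, name in zip(names, old.names)])
    # Assumption: packed layout, so the rebuilt dtype has the same item size.
    if new_dtype.itemsize != old.itemsize:
        raise ValueError("dtype has padding or offsets; this sketch does not handle that")
    # copy() gives independent data; view() only reinterprets the field names.
    return arr.copy().view(new_dtype)

arr = np.zeros(3, dtype=[('a', 'f8'), ('b', 'f8')])
arr3 = copy_with_names(arr, ('g', 'h'))
arr3['g'][0] = 1.0  # changes the copy only

print(arr.dtype.names)   # names of the original array
print(arr3.dtype.names)  # names of the renamed copy
```

Because nothing is ever written to an existing dtype, the original array keeps its names and its data.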

Actually, I struggled with a similar problem earlier without finding a solution. I wanted to modify part of a dtype, but didn't know how, so I ended up hard-coding the entire definition again for the second dtype (one of my fields was a sub-array whose shape I only know at runtime). So this was a good question for me :)
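For the "modify only part of a dtype" case, one possible approach is to rebuild the dtype field by field and swap in a new type for a single field. This is only a sketch: with_field_type is a made-up helper name, the field names 'id' and 'samples' are invented for illustration, and it again assumes a packed dtype without custom offsets.

```python
import numpy as np

def with_field_type(dtype, field, new_type):
    """Hypothetical helper: copy of a structured dtype with one field's type replaced."""
    if dtype.names is None or field not in dtype.names:
        raise KeyError(field)
    new_type = np.dtype(new_type)
    # Keep every other field as it is; only the chosen field gets the new type.
    return np.dtype([(name, new_type if name == field else dtype[name])
                     for name in dtype.names])

base = np.dtype([('id', 'i4'), ('samples', 'f8', (2,))])
n = 5  # stands in for a shape that is only known at runtime
runtime_dt = with_field_type(base, 'samples', ('f8', (n,)))

print(base)        # unchanged original definition
print(runtime_dt)  # same fields, 'samples' now has shape (n,)
```

This avoids repeating the whole definition: only the field that depends on the runtime value is spelled out again.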
