I'm trying to get a view of a 2D ndarray as a record or structured array without copying. This seems to work fine if a owns its data:

>>> a = np.array([[  1, 391,  14,  26],
...               [ 17, 371,  15,  30],
...               [641, 340,   4,   7]])
>>> b = a.view(list(zip('abcd', [a.dtype]*4)))
>>> b
array([[(1, 391, 14, 26)],
       [(17, 371, 15, 30)],
       [(641, 340, 4, 7)]], 
      dtype=[('a', '<i8'), ('b', '<i8'), ('c', '<i8'), ('d', '<i8')])
>>> b.base is a
True

But if the array I start from is itself the result of indexing a, this fails. Here's an example:

>>> b = a[:,[0,2,1,3]]
>>> b.base is None
False
>>> b.view(list(zip('abcd', [a.dtype]*4)))
ValueError: new type not compatible with array.

Interestingly, in this case b.base is a transpose of the view

>>> (b.base == b.T).all()
True

So it makes sense that numpy couldn't create the view that I wanted.

However, if I use

>>> b = np.take(a,[0,2,1,3],axis=1)

This results in b being a proper copy of the data, so taking the recarray view works. Side question: can someone explain this behavior in contrast to fancy indexing?

My question is, am I going about this the wrong way? Is taking a view the way I'm doing it not supported? If so, what would be the proper way to do it?

toes

1 Answer

(big edit)

b is F_CONTIGUOUS (see b.flags). The number of fields in the view then needs to match the number of rows of b, not the number of columns:

In [204]: b=a[:,[0,2,1,3]].view('i4,i4,i4')
In [205]: b
Out[205]: 
array([[(0, 4, 8), (2, 6, 10), (1, 5, 9), (3, 7, 11)]], 
      dtype=[('f0', '<i4'), ('f1', '<i4'), ('f2', '<i4')])

A simpler case is a.copy(order='F').view('i4,i4,i4')

np.take(a,[0,2,1,3],axis=1) and a[:,[0,2,1,3]].copy() produce C_CONTIGUOUS copies, and thus can be viewed with 4 fields.

Note also that b.base has 3 columns.
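
As a rough sketch of how to check this yourself, the snippet below prints the contiguity flags and strides of the three variants and then takes the 4-field view of the explicit copy. It assumes an int32 array so that the 'i4' fields match the element size; the names fancy, taken and copied are only illustrative. Recent NumPy documentation for ndarray.view says the last axis has to be contiguous when the item size changes, so on such versions the F_CONTIGUOUS case may simply raise instead of giving a 3-field view.

```python
import numpy as np

# Assumption: int32 data, so each 'i4' field is exactly one element.
a = np.arange(12, dtype='i4').reshape(3, 4)
cols = [0, 2, 1, 3]

fancy = a[:, cols]                 # slice combined with an index array
taken = np.take(a, cols, axis=1)   # np.take along the same axis
copied = a[:, cols].copy()         # explicit copy, C order by default

# Same values in all three; the layout is what may differ.
for name, arr in [('fancy', fancy), ('taken', taken), ('copied', copied)]:
    print(name, arr.flags['C_CONTIGUOUS'], arr.flags['F_CONTIGUOUS'], arr.strides)

# A C-contiguous (3, 4) int32 array holds one 16-byte record per row.
rec = copied.view('i4,i4,i4,i4')
print(rec.shape, rec.dtype.names)
```

The flags of the fancy-indexed result are printed rather than assumed, since the indexing documentation only says that the layout may differ.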


(earlier stumbling around the issue)

Being a view isn't an issue.

 a = np.arange(12, dtype='i4').reshape(3,4)
 a.view('i4,i4,i4,i4')

does just fine.

Making a copy of the first b also works:

 b=a[:,[0,2,1,3]].copy()
 b.view('i4,i4,i4,i4')

The 1st b (without copy) is F_CONTIGUOUS (look at b.flags). That's what your b.base == b.T is showing.

np.take produces the same sort of array as that b copy - i.e. same flags and same __array_interface__ display.

Other things that work:

a[[0,2,1],:].view('i4,i4,i4,i4')
a.T[[0,2,1,3],:].T.view('i4,i4,i4,i4')

If I replace the mixed slicing and array indexing with pure array indexing:

a[[[0],[1],[2]],[0,2,1,3]].view('i4,i4,i4,i4')

the result is C_CONTIGUOUS. So there are details in [:, [...]] that I haven't explained - specifically why it produces an F_CONTIGUOUS copy.

The mixed basic/advanced indexing doc section does warn that memory layout can change:

http://docs.scipy.org/doc/numpy/reference/arrays.indexing.html#combining-advanced-and-basic-indexing

In the simplest case, there is only a single advanced index. A single advanced index can for example replace a slice and the result array will be the same, however, it is a copy and may have a different memory layout. A slice is preferable when it is possible.
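
One way to avoid depending on whatever layout the indexing happens to produce is to normalize the layout before taking the view. The following is a minimal sketch under stated assumptions, not a definitive recipe: the helper name as_record_view and its arguments are hypothetical, and it assumes a 2D array whose columns should become the fields. np.ascontiguousarray does not copy an array that is already C_CONTIGUOUS, so only that case gives a true no-copy view; any other layout costs one copy.

```python
import numpy as np

def as_record_view(arr, names):
    """Hypothetical helper: view the rows of a 2D array as structured records."""
    if arr.ndim != 2 or arr.shape[1] != len(names):
        raise ValueError("need a 2D array with one column per field name")
    # No copy if arr is already C-contiguous; otherwise a C-ordered copy is made.
    arr = np.ascontiguousarray(arr)
    dtype = np.dtype([(name, arr.dtype) for name in names])
    # Each row becomes one record; the view has shape (n, 1), so flatten it.
    return arr.view(dtype).reshape(-1)

a = np.arange(12, dtype='i4').reshape(3, 4)
rec = as_record_view(a, 'abcd')                      # shares memory with a
rec_b = as_record_view(a[:, [0, 2, 1, 3]], 'abcd')   # may need a copy first
print(rec['b'], rec_b['b'])
```

Checking arr.flags directly is also reasonable; the helper just folds that check into a single call.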

hpaulj
  • Thanks! I didn't know memory layout could change like that. To be clear, I knew that indexing like `a[:,[0,2,1,3]]` returns a copy, but I didn't know that it could be `F_CONTIGUOUS` or that `np.take` always returns a `C_CONTIGUOUS` array (the numpy doc says that `np.take` does the same thing as fancy indexing, which seems wrong in this instance). How would you get this particular view (without a copy) in a way that always works? Is it bad practice to check whether the array is `F_CONTIGUOUS` or `C_CONTIGUOUS`? – toes Aug 23 '15 at 01:50