I must be making some sort of really trivial mistake, but I'm trying to create a structured array with names for a single axis. For example, I have an array `data` with shape (2, 3, 4), and I want to name the first axis so that I can access data['a'] and data['b'] and in both cases get (3, 4) shaped slices. I tried:

import numpy as np

shape = (2, 3, 4)
data = np.arange(np.prod(shape)).reshape(shape)

dtype = [(nn, float) for nn in ['a', 'b']]
data = np.array(data, dtype=dtype)

But this seems to duplicate all of the data into both 'a' and 'b', e.g.

print(data.shape)
print(data['a'].shape)
> (2, 3, 4)
> (2, 3, 4)

I tried specifying that the shape (in the dtype specification) should be (3, 4) but that duplicated the data 12 more times... and I tried changing the axes order to (3, 4, 2), but that doesn't do anything. Any help appreciated!

martineau
DilithiumMatrix
  • If you `print (data)` and `print(data['a'])`, you will get your answer. I didn't downvote. – Sheldore Jan 28 '19 at 17:33
  • @Bazingaa, thanks, I see that `data` has effectively been reshaped into `(2, 3, 4, 2)`, where the last axis is named 'a', 'b'... but I'm not sure why or how to stop doing that. – DilithiumMatrix Jan 28 '19 at 17:45
  • @DilithiumMatrix The form `(str, float)` for dtypes adds a *field name* to the data type. That has nothing to do with array axes. An array stores instances of a certain data type, regardless of the dtype having field names or not. If you want labelled axes, take a look at [xarray](https://pypi.org/project/xarray/) (or maybe [pandas](https://pypi.org/project/pandas/)). – a_guest Jan 28 '19 at 19:30
  • Make a `np.zeros((3,4), dtype)` array, and copy from `data` to the 2 fields. – hpaulj Jan 28 '19 at 19:32

1 Answer

In [263]: shape = (2, 3, 4)
     ...: data = np.arange(np.prod(shape)).reshape(shape)
     ...: 
     ...: dtype = [(nn, float) for nn in ['a', 'b']]

While it may be possible to transform data, the surer approach is to make the desired target array, and copy values to it:

In [264]: res = np.zeros(shape[1:], dtype)
In [265]: res['a'] = data[0]
In [266]: res['b'] = data[1]
In [267]: res
Out[267]: 
array([[( 0., 12.), ( 1., 13.), ( 2., 14.), ( 3., 15.)],
       [( 4., 16.), ( 5., 17.), ( 6., 18.), ( 7., 19.)],
       [( 8., 20.), ( 9., 21.), (10., 22.), (11., 23.)]],
      dtype=[('a', '<f8'), ('b', '<f8')])
In [268]: res['a'].shape
Out[268]: (3, 4)
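
With more than two fields, the same copy can be written as a loop over the field names. This is a minimal sketch and not part of the original answer. It assumes the first axis of `data` has exactly one entry per field name, and it uses `np.prod` because `np.product` is no longer available in recent NumPy releases.

```python
import numpy as np

shape = (2, 3, 4)
data = np.arange(np.prod(shape)).reshape(shape)
dtype = [(nn, float) for nn in ['a', 'b']]

# Allocate the target: one (a, b) record per position of the trailing (3, 4) axes.
res = np.zeros(shape[1:], dtype)

# Iterating over `data` walks its first axis, so each `plane` has shape (3, 4).
for name, plane in zip(res.dtype.names, data):
    res[name] = plane

print(res['a'].shape, res['b'].shape)
```

Each assignment copies one (3, 4) plane into a field, so `res` does not share memory with `data`.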

In this structured array, each record consists of 2 floats, and the data buffer contains:

In [272]: res.view(float).ravel()
Out[272]: 
array([ 0., 12.,  1., 13.,  2., 14.,  3., 15.,  4., 16.,  5., 17.,  6.,
       18.,  7., 19.,  8., 20.,  9., 21., 10., 22., 11., 23.])

This is different from the data, [0,1,2,3,...]. So there isn't any sort of reshape or view or astype that will convert one to the other.

So there is a simple mapping from the structured array to a (3,4,2) array, but not to your (2,3,4) source.

In [273]: res.view(float).reshape(3,4,2)
Out[273]: 
array([[[ 0., 12.],
        [ 1., 13.],
        [ 2., 14.],
        [ 3., 15.]],

       [[ 4., 16.],
        [ 5., 17.],
        [ 6., 18.],
        [ 7., 19.]],

       [[ 8., 20.],
        [ 9., 21.],
        [10., 22.],
        [11., 23.]]])
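
If a library helper is preferred over copying field by field, one option is to move the first axis of `data` to the end, which gives the same (3,4,2) layout as the view above, and then convert with `numpy.lib.recfunctions.unstructured_to_structured`. This is a sketch rather than part of the original answer. It assumes NumPy 1.16 or later, and it still copies the values instead of producing a view of the source.

```python
import numpy as np
from numpy.lib import recfunctions as rfn

shape = (2, 3, 4)
data = np.arange(np.prod(shape)).reshape(shape)
dtype = np.dtype([(nn, float) for nn in ['a', 'b']])

# Move the axis that should become field names to the end: (2, 3, 4) -> (3, 4, 2).
moved = np.moveaxis(data, 0, -1)

# The last axis is turned into the fields 'a' and 'b'; the result has shape (3, 4).
res = rfn.unstructured_to_structured(moved, dtype)

print(res.shape)
print(res['a'].shape)
```

The `moveaxis` step is what bridges the gap described above: the structured layout interleaves 'a' and 'b' values, so the named axis has to be last before the conversion.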
hpaulj