This is worded similarly to the question "ndim in numpy array loaded with scipy.io.loadmat?", but it's actually a lot more basic.

Say I have this structured array:

import sys
import numpy as np
from pprint import pprint

a = np.array([(1.5,2.5),(3.,4.),(1.,3.)],
        dtype=[('x','f4'),('y',np.float32)])

pprint(a)
# array([(1.5, 2.5), (3.0, 4.0), (1.0, 3.0)],
#       dtype=[('x', '<f4'), ('y', '<f4')])

I see that as a table of 3 rows and 2 columns, so 3x2. However, trying to use ndim here, I see:

print(".ndim", a.ndim)
print(".shape", a.shape)
print("asarray.ndim", np.asarray(a).ndim)
# ('.ndim', 1)
# ('.shape', (3,))
# ('asarray.ndim', 1)

... and this is what puzzles me - what makes numpy think this should be a 1d array, when there are clearly fields/columns defined ?!

Given that output, no wonder that reshape doesn't work:

pprint(a.reshape(3,2))
# ValueError: total size of new array must be unchanged

Now, I can bruteforce the structured array into a ("normal" numpy, I guess?) array:

b = np.column_stack((a['x'], a['y']))
pprint(b)
# array([[ 1.5,  2.5],
#        [ 3. ,  4. ],
#        [ 1. ,  3. ]], dtype=float32)

print(".ndim", b.ndim)
print(".shape", b.shape)
print("asarray.ndim", np.asarray(b).ndim)
# ('.ndim', 2)
# ('.shape', (3, 2))
# ('asarray.ndim', 2)

... and so I get the information that I expect.

But I am wondering - why does numpy behave like this with structured arrays - and is there a way to retrieve the 3x2 shape information from the original structured array (a) directly, without "casting" to a "normal" array?

sdaau

2 Answers

The number of dimensions of a numpy array is defined independently of its data type, in a way that is consistent both for simple data types such as float64 and for more complex user-defined types. Remember that a dtype can be quite a fancy object, and its fields can even be of different types. You are expecting numpy to look inside your fancy type and assume that it is a regular array.

If you want to know how many fields your custom type has, you can get that from, for example, len(a.dtype).
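
As a minimal sketch of that idea, the 3x2 "table shape" can be assembled from the structured array itself by combining a.shape (the number of records) with the number of fields in the dtype. The name table_shape is made up for this illustration; it is not a numpy attribute.

```python
import numpy as np

a = np.array([(1.5, 2.5), (3., 4.), (1., 3.)],
             dtype=[('x', 'f4'), ('y', np.float32)])

# The array shape counts records; the dtype knows how many fields each record has.
n_fields = len(a.dtype)               # same value as len(a.dtype.names)
table_shape = a.shape + (n_fields,)   # hypothetical name, not a numpy attribute

print(a.dtype.names)   # ('x', 'y')
print(table_shape)     # (3, 2)
```

This only describes the layout; a itself is still a 1d array of records, so a.ndim stays 1.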

DaveP

For a structured array whose fields all have the same type, you can use view:

>>> dtype = [('x','f4'),('y','f4')]
>>> a = np.array([(1.5,2.5), (3.,4.), (1.,3.)], dtype=dtype)
>>> a.view('f4').reshape(a.shape[0], -1)
array([[ 1.5,  2.5],
       [ 3. ,  4. ],
       [ 1. ,  3. ]], dtype=float32)
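
As an alternative to the manual view and reshape, newer NumPy releases ship a helper for this conversion. The sketch below assumes NumPy 1.16 or later, where numpy.lib.recfunctions.structured_to_unstructured is available; the variable names are only for illustration.

```python
import numpy as np
from numpy.lib import recfunctions as rfn

a = np.array([(1.5, 2.5), (3., 4.), (1., 3.)],
             dtype=[('x', 'f4'), ('y', 'f4')])

# Turns the fields into a new trailing axis: one column per field.
b = rfn.structured_to_unstructured(a)

print(b.shape)  # (3, 2)
print(b.dtype)  # float32
```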

In the general case you should be careful, since the fields of a record can have different sizes, and the reinterpretation can be ambiguous:

>>> dtype = [('x','f4'), ('y','f8')]
>>> a = np.array([(1.5,2.5), (3.,4.), (1.,3.)], dtype=dtype)
>>> a.view('f8')
Traceback (most recent call last):
  ...
ValueError: new type not compatible with array.
>>> a.view('f4').reshape(a.shape[0], -1)
array([[ 1.5   ,  0.    ,  2.0625],
       [ 3.    ,  0.    ,  2.25  ],
       [ 1.    ,  0.    ,  2.125 ]], dtype=float32)

or even, byte by byte:

>>> a.view('i1').reshape(a.shape[0], -1)
array([[   0,    0,  -64,   63,    0,    0,    0,    0,    0,    0,    4,   64],
       [   0,    0,   64,   64,    0,    0,    0,    0,    0,    0,   16,   64],
       [   0,    0, -128,   63,    0,    0,    0,    0,    0,    0,    8,   64]],  dtype=int8)
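
The 12 columns in the byte dump above follow from the record layout: each record packs a 4-byte 'x' followed by an 8-byte 'y'. As a rough sketch, the layout can be inspected through the dtype, and for mixed field types it is safer to convert the values field by field than to reinterpret the raw bytes. Using np.column_stack here is just one option, and the result is promoted to a common type.

```python
import numpy as np

a = np.array([(1.5, 2.5), (3., 4.), (1., 3.)],
             dtype=[('x', 'f4'), ('y', 'f8')])

# Size of one record in bytes (4 for 'x' plus 8 for 'y', no padding by default).
print(a.dtype.itemsize)  # 12

# Each entry of dtype.fields starts with (field dtype, byte offset in the record).
for name in a.dtype.names:
    field_dtype, offset = a.dtype.fields[name][:2]
    print(name, field_dtype, offset)

# Convert the values instead of reinterpreting bytes:
# every field becomes one column, promoted to a common dtype.
b = np.column_stack([a[name] for name in a.dtype.names])
print(b.dtype, b.shape)  # float64 (3, 2)
```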
alko