Passing structured array to Cython, failed (I think it is a Cython bug)

Question

Suppose I have the following structured array:

a = np.zeros(2, dtype=[('a', np.int),  ('b', np.float, 2)])
a[0] = (2,[3,4])
a[1] = (6,[7,8])

Then I define the matching struct in Cython:

import numpy as np
cimport numpy as np

cdef packed struct mystruct:
  np.int_t a
  np.float_t b[2]

def test_mystruct(mystruct[:] x):
  cdef:
    int k
    mystruct y

  for k in range(2):
    y = x[k]
    print y.a
    print y.b[0]
    print y.b[1]

After this, I run

test_mystruct(a)

and I get this error:

ValueError                                Traceback (most recent call last)
<ipython-input-231-df126299aef1> in <module>()
----> 1 test_mystruct(a)
_cython_magic_5119cecbaf7ff37e311b745d2b39dc32.pyx in _cython_magic_5119cecbaf7ff37e311b745d2b39dc32.test_mystruct (/auto/users/pwang/.cache/ipython/cython/_cython_magic_5119cecbaf7ff37e311b745d2b39dc32.c:1364)()
ValueError: Expected 1 dimension(s), got 1

How can I fix this? Thank you.

Sorry, I edited the question, could you please remove duplicate flag @CristiánAntuña . Thank you. — SDE_Amazon, May 04 '15 at 18:42
@hpaulj: it was a different question at first. Pengyu: comment removed. — Cristián Antuña, May 04 '15 at 18:46
http://stackoverflow.com/questions/9423207/accessing-numpy-record-array-columns-in-cython successfully passes a structured array to C packed structure. And this finds a way around the same error: http://stackoverflow.com/questions/17239091/cython-memoryviews-from-array-of-structs — hpaulj, May 04 '15 at 20:26
@hpaulj, the problem is that an array member inside the C struct causes the error. I don't see how either of those two posts gets around it. — SDE_Amazon, May 04 '15 at 21:24
In `tests/memoryview/numpy_memoryview.pyx`, `def test_structarray_errors(StructArray[:] a):` tests a packed struct with `int a[4]`. — hpaulj, May 05 '15 at 02:57

hpaulj · Answer 1 · 2017-10-02T03:40:05.737

This pyx compiles and imports ok:

import numpy as np
cimport numpy as np

cdef packed struct mystruct:
  int a[2]    # change from plain int
  float b[2]
  int c

def test_mystruct(mystruct[:] x):
  cdef:
    int k
    mystruct y

  for k in range(2):
    y = x[k]
    print y.a
    print y.b[0]
    print y.b[1]

dt='2i,2f,i'
b=np.zeros((3,),dtype=dt)
test_mystruct(b)

I started with the test example mentioned in my comment and adapted it to your case. I think the key change was to define the first element of the packed struct as int a[2]. So it appears that if any element is an array, the first must be an array too for the structure to be set up properly.
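To see what the dtype string '2i,2f,i' used above expands to, and to check that it lines up with a packed struct of int a[2], float b[2], int c, something like the following plain NumPy sketch can help. It assumes a platform where C int and float are both 4 bytes; the struct module format '=2i2fi' is only my stand-in for the packed C layout, not something Cython itself uses.

```python
import struct
import numpy as np

# The comma-separated string auto-names the fields f0, f1, f2.
dt = np.dtype('2i,2f,i')

# '=' means standard sizes and no padding, similar to a packed struct.
# This format string is an illustrative stand-in for the C struct layout.
packed_size = struct.calcsize('=2i2fi')

print(dt.names)
print(dt.itemsize, packed_size)

# Print each field's dtype (including subarray shape) and byte offset.
offsets = {}
for name in dt.names:
    field_dtype, offset = dt.fields[name][:2]
    offsets[name] = offset
    print(name, field_dtype, offset)
```

If the itemsize and the packed size disagree, the dtype and the struct do not describe the same memory layout, and the memoryview conversion cannot be expected to work.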

This is clearly a bug that the test file isn't catching.

Defining the element as int a[1] doesn't work, possibly because the dtype removes such a dimension:

In [47]: np.dtype([('a', np.int, 1),  ('b', np.float, 2)])
Out[47]: dtype([('a', '<i4'), ('b', '<f8', (2,))])

Until the issue is reported and patched, it shouldn't be hard to define the dtype in a way that works around it.

The struct could have a[1], but the array dtype would then have to specify the shape with a tuple: ('a','i',(1,)). With ('a','i',1) the field's shape becomes ().
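As a small sketch of the tuple form, here is a dtype meant to correspond to a hypothetical packed struct with int a[1] and float b[2]. The field types np.intc and np.float32 are my assumption for C int and float; the point is only that the explicit (1,) keeps the length-1 dimension on the field.

```python
import numpy as np

# Hypothetical dtype for: cdef packed struct { int a[1]; float b[2] }
# The explicit tuple (1,) keeps 'a' as a length-1 subarray.
dt = np.dtype([('a', np.intc, (1,)), ('b', np.float32, (2,))])

arr = np.zeros(3, dtype=dt)

print(dt['a'].shape)   # subarray shape of field 'a'
print(arr['a'].shape)  # one length-1 row per record
print(dt.itemsize)     # bytes per record, no padding
```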

If one of the struct arrays is 2d, it looks like all of them have to be:

cdef packed struct mystruct:
  int a[1][1]
  float b[2][1]
  int c[2][2]

https://github.com/cython/cython/blob/c4c2e3d8bd760386b26dbd6cffbd4e30ba0a7d13/tests/memoryview/numpy_memoryview.pyx

Stepping back a bit, I wonder what the point is of processing a complex structured array in Cython. For some operations, wouldn't it work just as well to pass the fields as separate variables? For example, myfunc(a['a'], a['b']) instead of myfunc(a).
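A rough illustration of that idea in plain Python, using the array from the question. Here myfunc is a hypothetical stand-in for a Cython function that would take one typed memoryview per field; the arithmetic inside it is arbitrary.

```python
import numpy as np

def myfunc(a_col, b_col):
    # Hypothetical stand-in for a Cython function with one typed
    # memoryview argument per field instead of a struct memoryview.
    return a_col + b_col.sum(axis=1)

a = np.zeros(2, dtype=[('a', np.int_), ('b', np.float64, (2,))])
a[0] = (2, [3, 4])
a[1] = (6, [7, 8])

# Each field is a view into the structured array, so nothing is copied.
result = myfunc(a['a'], a['b'])
print(result)
```

One thing to keep in mind: the field views are strided rather than contiguous, so on the Cython side the arguments would need to be declared as general memoryviews, or the fields copied with np.ascontiguousarray first.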

score 1 · Answer 2 · answered Aug 08 '17 at 08:02

There is a general method to get the dtype for a C struct, but it involves a temporary variable:

cdef mystruct _tmp
dt = np.asarray(<mystruct[:1]>(&_tmp)).dtype

This requires at least numpy 1.5. See discussion here: https://github.com/scikit-learn/scikit-learn/pull/2298
