How to pass a NumPy array to a Cython function correctly?

Question

This is described in many places, but I simply cannot get it to work. I am calling a C++ function from Cython:

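import numpy as np   # runtime module, needed for np.array
cimport numpy as np  # compile-time declarations, needed for np.ndarray[...]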
cdef extern from "test.h" namespace "mytest":
   void test(double *A, int m)

cdef int foo():
  cdef np.ndarray[double,mode="c"] a = np.array([1,2,3,4,5],dtype=float)
  # pass ptr to first element of 'a'
  test(&a[0], len(a))
  return 0

foo()

test.cpp is just:

#include <stdio.h>
namespace mytest {
    void test(double *A, int m)
    {
        for (int i = 0; i < m; i++)
        {
            printf("%d is %f\n", i, A[i]);
        }
    }
}

test.h just has:

namespace mytest {
  void test(double *A, int m);
}

This seems to work but when is np.ascontiguousarray needed? Is it sufficient to do:

cdef np.ndarray[double,mode="c"] a = np.array([1,2,3,4,5],dtype=float)

or do you need:

cdef np.ndarray[double,mode="c"] a = np.ascontiguousarray(np.array([1,2,3,4,5],dtype=float))

Second, and more importantly, how does this generalize to 2D arrays?

Handling 2d arrays

Here is my attempt at passing 2d numpy arrays to C++ which does not work:

cdef np.ndarray[double,mode="c",ndim=2] a = np.array([[1,2],[3,4]],dtype=float)

which is called as:

test(&a[0,0], a.shape[0], a.shape[1])

in the cpp code:

void test(double *A, int m, int n) 
{ 
  printf("reference 0,0 element\n");
  printf("%f\n", A[0][0]);
}

UPDATE: The correct answer

The correct answer is to use linear indexing for the array and not the [][] syntax. For a C-contiguous m x n array (m rows, n columns), element (i, j) is at offset i*n + j. The correct way to print the 2D array is:

for (int i = 0; i < m; i++)
{
    for (int j = 0; j < n; j++)
    {
        printf("%d, %d is %f\n", i, j, A[i*n + j]);
    }
}
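
As a minimal sketch of the linear indexing described above, the C++ side can wrap the offset computation in a small helper. The name `flat_index` is hypothetical and not part of the original code; the sketch assumes the array is C-contiguous (row-major) with `n` columns.

```cpp
#include <cstddef>
#include <cstdio>

namespace mytest {

// Hypothetical helper (not in the original post): offset of element (i, j)
// in a C-contiguous (row-major) array that has n columns.
inline std::size_t flat_index(int i, int j, int n)
{
    // Skip i full rows of n elements, then move j columns into the row.
    return static_cast<std::size_t>(i) * static_cast<std::size_t>(n)
         + static_cast<std::size_t>(j);
}

// Print an m x n row-major array that was passed as a flat pointer.
void test(const double *A, int m, int n)
{
    for (int i = 0; i < m; i++)
    {
        for (int j = 0; j < n; j++)
        {
            std::printf("%d, %d is %f\n", i, j, A[flat_index(i, j, n)]);
        }
    }
}

}  // namespace mytest
```

The multiplier is the number of columns `n`, not the number of rows `m`. The two only coincide for square arrays, which is why a 2 x 2 test case cannot reveal the difference.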

related: [Best Practices for passing numpy data pointer to C](https://groups.google.com/forum/#!msg/cython-users/8uuxjB_wbBQ/wqRbsLDPCTsJ) — jfs, Feb 26 '14 at 23:20
@J.F.Sebastian: Thank you I've been reading that thread but it's confusing me more. I am basically trying out the "&arr[0]" method since it makes most sense to me but I haven't seen any examples that work. (I don't want to use ctypes) — , Feb 26 '14 at 23:24
In your two dimensional example, it looks to me like you are dereferencing the pointer `A` twice. For a 2D array you will probably have to do the index arithmetic manually. For example, if you have a C contiguous m x n array and you want to do the C equivalent of NumPy's `A[i,j]` you would have to do `A[n*i+j]` instead of `A[0][0]`. Dereferencing the pointer twice will probably crash Python. — IanH, Feb 26 '14 at 23:33
`m` is the number of rows. The datatype is taken care of when you specify the type for the pointer. I'll throw together a quick example. — IanH, Feb 26 '14 at 23:59
@IanH: I understand now, I updated my answer to have a working example for future users. — , Feb 27 '14 at 00:01
@user248237dfsf: I've liked the suggestion from the thread to use [typed memoryview](http://docs.cython.org/src/userguide/memoryviews.html)s ([`&s[0]` syntax to pass to C](http://stackoverflow.com/q/14584439/4279)). — jfs, Feb 27 '14 at 00:06

score 6 · Answer 1 · answered Feb 26 '14 at 23:16

For 2D arrays, you just need the ndim keyword:

cdef np.ndarray[double, mode="c", ndim=2]

The resulting array may or may not share memory with the original object it was created from. If it does share memory, it may not be contiguous, or it may have an unusual stride layout. In that case, passing the buffer directly to C/C++ will be disastrous.

You should always use ascontiguousarray unless your C/C++ code is prepared to deal with non-contiguous data (in which case you will need to pass in all relevant stride data from Cython into the C function). If the input array is already contiguous, no copy will be made. Make sure to pass a compatible dtype to ascontiguousarray so that you don't risk a second copy (e.g. having to convert from a contiguous float array to a contiguous double array).
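
If the C/C++ side does need to accept non-contiguous data, a rough sketch of what "passing in the stride data" could look like is shown here. The functions `at` and `strided_sum` are hypothetical names introduced for illustration. The sketch assumes the strides are given in bytes, which is the unit NumPy uses for an array's `strides` attribute.

```cpp
#include <cstddef>

namespace mytest {

// Hypothetical stride-aware accessor (an assumption, not from the original post).
// stride0 and stride1 are byte strides for the row and column axes.
inline double at(const double *A, std::ptrdiff_t stride0, std::ptrdiff_t stride1,
                 std::ptrdiff_t i, std::ptrdiff_t j)
{
    // Do the arithmetic in bytes, then reinterpret the address as a double.
    const char *base = reinterpret_cast<const char *>(A);
    return *reinterpret_cast<const double *>(base + i * stride0 + j * stride1);
}

// Sum every element of an m x n array that has arbitrary byte strides.
double strided_sum(const double *A, int m, int n,
                   std::ptrdiff_t stride0, std::ptrdiff_t stride1)
{
    double total = 0.0;
    for (int i = 0; i < m; i++)
    {
        for (int j = 0; j < n; j++)
        {
            total += at(A, stride0, stride1, i, j);
        }
    }
    return total;
}

}  // namespace mytest
```

For a C-contiguous array of doubles with `n` columns, the byte strides are `n * sizeof(double)` and `sizeof(double)`, and the accessor reduces to `A[i*n + j]`. Calling `ascontiguousarray` on the Cython side is usually simpler than threading strides through every C function.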

nneonneo


@nneoneo: I had ndim, but had a typo in original post, sorry. adding ndim=2 is necessary to make the array but I still don't know how to access it from C. How can I access it on the C side? Can you show an example? – Feb 26 '14 at 23:21
Seems like it should work if you added `ascontiguousarray`; what doesn't seem to be working about it? – nneonneo Feb 26 '14 at 23:23
@nneoneo: try passing the 2d array to a C++ function and printing its elements using ``A[i][j]`` - it does not work. Double indexing just gives junk. Do you see what I mean? – Feb 26 '14 at 23:47
Oh, I see the problem. You can't double index like that in C unless you have a `double **` (an array of pointers) or a `double [][]` (a declared 2D array). Otherwise, with just a `double *`, you have to index manually: `A[i*n + j]`. – nneonneo Feb 26 '14 at 23:50
By the way, if you are *writing* the C++ algorithm from scratch, have you considered just writing the whole thing in Cython? Then you can use double-indexing `A[i,j]` with any array, not just contiguous ones, and Cython handles everything for you. Plus, it's quite fast since most of the operations will happen in pure C. – nneonneo Feb 26 '14 at 23:52
@nneoneo: Could you say how ``double **`` would work here? My array as you say is declared to be 2d on the cython side. Does that mean that my function can have an argument of type ``double **`` and then use ``[][]`` for indexing? Or do I need to do some casting first into an array of pointers? This is what I don't understand from the manuals – Feb 26 '14 at 23:55
@user248237dfsf: You don't have an array of pointers (you have only a flat array of data), so you **cannot** use `double **`. (That would be for cases when you had an indirected array of pointers to the start of each row, for example). Cython will never give you `double **`; you always get `double *` no matter how many dimensions you have. – nneonneo Feb 26 '14 at 23:56
@nneoneo: OK, this was the part that was unclear. I think I get it now. So the conclusion is that you should always use contiguous 'row' indexing for all arrays, regardless of dimensions, when using Cython – Feb 26 '14 at 23:58
In C, you have to use 1D indexing, yes. This is typical when dealing with a buffer of contiguous data. (On multidimensional C arrays, like `double [][]`, the C compiler essentially generates the equivalent 1D indexing operation by multiplying the array dimensions with the indices as appropriate). – nneonneo Feb 27 '14 at 00:00
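
To illustrate the `double *` versus `double **` distinction from the comments above, here is a small sketch of what `[][]` indexing on a `double **` actually requires: a separate table of row pointers built on top of the flat buffer. The helper names `make_row_pointers` and `at_rows` are hypothetical, and the sketch assumes a C-contiguous m x n buffer.

```cpp
#include <cstddef>
#include <vector>

namespace mytest {

// Hypothetical helper: build the table of row pointers that [][] indexing
// on a double ** needs, pointing into a flat row-major m x n buffer.
inline std::vector<double *> make_row_pointers(double *A, int m, int n)
{
    std::vector<double *> rows(static_cast<std::size_t>(m));
    for (int i = 0; i < m; i++)
    {
        // Row i starts i*n elements into the flat buffer.
        rows[static_cast<std::size_t>(i)] = A + static_cast<std::ptrdiff_t>(i) * n;
    }
    return rows;
}

// Read element (i, j) through the row-pointer table (two dereferences).
inline double at_rows(double *const *rows, int i, int j)
{
    return rows[i][j];
}

}  // namespace mytest
```

A flat buffer from Cython contains no such table, so treating it as a `double **` makes the C code interpret element values as addresses. Building the table is possible but adds an allocation, which is one reason `A[i*n + j]` is the usual choice.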
