How to account for column-contiguous array when extending numpy with C

Question

I have a C-function to normalize the rows of an array in log-space (this prevents numerical underflow).

The prototype of my C-function is as follows:

void normalize_logspace_matrix(size_t nrow, size_t ncol, double* mat);

You can see that it takes a pointer to an array and modifies it in place. The C-code of course assumes the data is saved as a C-contiguous array, i.e. row-contiguous.

I wrap the function as follows using Cython (imports and cdef extern from omitted):

def normalize_logspace(np.ndarray[np.double_t, ndim=2] mat):
    cdef Py_ssize_t n, d
    n = mat.shape[0]
    d = mat.shape[1]
    normalize_logspace_matrix(n, d, <double*> mat.data)
    return mat

Most of the time numpy-arrays are row-contiguous and the function works fine. However, if a numpy-array has been previously transposed the data is not copied around but just a new view into the data is returned. In this case my function fails because the array is no longer row-contiguous.

I can get around this by defining the array to have Fortran-contiguous order, such that after transposing it will be C-contiguous:

A = np.array([some_func(d) for d in range(D)], order='F').T
A = normalize_logspace(A)

Obviously that's very error-prone and the user has to take care that the array is in the correct order which is something that the user shouldn't need to care about in Python.

What's the best way how I can make this work with both row- and column-contiguous arrays? I assume that some kind of array-order checking in Cython is the way to go. Of course, I'd prefer a solution that doesn't require to copy the data into a new array, but I almost assume that's necessary.

score 8 · Accepted Answer · answered Dec 12 '10 at 15:53

If you want to support arrays in C and Fortran order without ever copying, your C function needs to be flexible enough to support both orderings. This can be achieved by passing the strides of the NumPy array to the C function: Change the prototype to

void normalize_logspace_matrix(size_t nrow, size_t ncol, 
                               size_t rowstride, size_t colstride,
                               double* mat);

and the Cython call to

def normalize_logspace(np.ndarray[np.double_t, ndim=2] mat):
    cdef Py_ssize_t n, d, rowstride, colstride
    n = mat.shape[0]
    d = mat.shape[1]
    rowstride = mat.strides[0] // mat.itemsize
    colstride = mat.strides[1] // mat.itemsize
    normalize_logspace_matrix(n, d, rowstride, colstride, <double*> mat.data)
    return mat

Then, replace every occurence of mat[row*ncol + col] in your C code by mat[row*rowstride + col*colstride].

Is this answer from 2010 still current, or is there a better way to achieve this now? — Fred Foo, Aug 12 '13 at 19:09
@larsmans: I don't know exactly what you mean by "this". Writing a C function that can deal with both Fortran-contiguous and C-contiguous two-dimensional arrays still works this way, if this is what you want. If it's ok that your arrays get copied, there are (and have been in 2010) other solutions. — Sven Marnach, Aug 12 '13 at 22:42

ulidtko · Answer 2 · 2010-12-12T13:50:09.507

3

In this case you really do want to create a copy of the input array (which can be a view on a real array) with guaranteed row-contiguous order. You can achieve this with something like this:

a = numpy.array(A, copy=True, order='C')

Also, consider taking a look at the exact array interface of Numpy (there is C part too).

edited Dec 12 '10 at 13:50

answered Dec 12 '10 at 13:41

ulidtko

14,740
10
56
88

score 0 · Answer 3 · answered May 07 '11 at 13:25

+1 to Sven, whose answer solves the gotcha (well, got me) that dstack returns an F_contiguous array ?!

# don't use dstack to stack a,a,a -> rgb for a C func

import sys
import numpy as np

h = 2
w = 4
dim = 3
exec( "\n".join( sys.argv[1:] ))  # run this.py h= ...

a = np.arange( h*w, dtype=np.uint8 ) .reshape((h,w))
rgb = np.empty( (h,w,dim), dtype=np.uint8 )
rgb[:,:,0] = rgb[:,:,1] = rgb[:,:,2] = a
print "rgb:", rgb
print "rgb.flags:", rgb.flags  # C_contiguous
print "rgb.strides:", rgb.strides  # (12, 3, 1)

dstack = np.dstack(( a, a, a ))
print "dstack:", dstack
print "dstack.flags:", dstack.flags  # F_contiguous
print "dstack.strides:", dstack.strides  # (1, 2, 8)

How to account for column-contiguous array when extending numpy with C

3 Answers3