Fast iteration over vectors in a multidimensional numpy array

Question

I'm writing some python + numpy + cython code, and am trying to find the most elegant and efficient way of doing the following kind of iteration over an array:

Let's say I have a function f(x, y) that takes a vector x of shape (3,) and a vector y of shape (10,) and returns a vector of shape (10,). Now I have two arrays X and Y of shape sx + (3,) and sy + (10,), where the sx and sy are two shapes that can be broadcast together (i.e. either sx == sy, or when an axis differs, one of the two has length 1, in which case it will be repeated). I want to produce an array Z that has the shape zs + (10,), where zs is the shape of the broadcasting of sx with sy. Each 10 dimensional vector in Z is equal to f(x, y) of the vectors x and y at the corresponding locations in X and Y.

I looked into np.nditer and while it plays nice with cython (see bottom of linked page), it doesn't seem to allow iterating over vectors from a multidimensional array, instead of elements. I also looked at index grids, but the problem there is that cython indexing is only fast when the number of indexes is equal to the dimensionality of the array, and are stored as cython integers instead of python tuples.

Any help is greatly appreciated!

Would you be able to add a concrete example to your requirements? As it is now, it is not as easy to follow your train of thought. — Oliver W., Nov 19 '14 at 20:39
I would, but the links in Jaime's (rather quick) answer clearly explain the concept I had in mind. — John von N., Nov 19 '14 at 22:33
`multi_index` can be used to control the depth of `nditer` iteration. Basically you construct an iterator over `zs` shape, and use that to index `X`, `Y` and the output. http://stackoverflow.com/a/25097271/901925 — hpaulj, Nov 20 '14 at 04:15

score 5 · Accepted Answer · answered Nov 19 '14 at 21:34

You are describing what Numpy calls a Generalized Universal FUNCtion, or gufunc. As it name suggests, it is an extension of ufuncs. You probably want to start by reading these two pages:

The second example uses Cython and has some material on gufuncs. To fully go down the gufunc road, you will need to read the corresponding section in the numpy C API documentation:

Generalized Universal Function API

I do not know of any example of gufuncs being coded in Cython, although it shouldn't be too hard to do following the examples above. If you want to look at gufuncs coded in C, you can take a look at the source code for np.linalg here, although that can be a daunting experience. A while back I bored my local Python User Group to death giving a talk on extending numpy with C, which was mostly about writing gufuncs in C, the slides of that talk and a sample Python module providing a new gufunc can be found here.

Thanks! This seems to be exactly what I was looking for. I knew this had to be a common problem, but I didn't know about generalized ufuncs yet. — John von N., Nov 19 '14 at 21:55

score 2 · Answer 2 · edited May 23 '17 at 12:14

If you want to stick with nditer, here's a way using your example dimensions. It's pure Python here, but shouldn't be hard to implement with cython (though it still has the tuple iterator). I'm borrowing ideas from ndindex as described in shallow iteration with nditer

The idea is to find the common broadcasting shape, sz, and construct a multi_index iterator over it.

I'm using as_strided to expand X and Y to usable views, and passing the appropriate vectors (actually (1,n) arrays) to the f(x,y) function.

import numpy as np
from numpy.lib.stride_tricks import as_strided

def f(x,y):
    # sample that takes (10,) and (3,) arrays, and returns (10,) array
    assert x.shape==(1,10), x.shape
    assert y.shape==(1,3), y.shape
    z = x*10 + y.mean()
    return z

def brdcast(X, X1):
    # broadcast X to shape of X1 (keep last dim of X)
    # modeled on np.broadcast_arrays
    shape = X1.shape + (X.shape[-1],)
    strides = X1.strides + (X.strides[-1],)
    X1 = as_strided(X, shape=shape, strides=strides)
    return X1

def F(X, Y):
    X1, Y1 = np.broadcast_arrays(X[...,0], Y[...,0])
    Z = np.zeros(X1.shape + (10,))
    it = np.nditer(X1, flags=['multi_index'])

    X1 = brdcast(X, X1)
    Y1 = brdcast(Y, Y1)

    while not it.finished:
        I = it.multi_index + (None,)
        Z[I] = f(X1[I], Y1[I])
        it.iternext()
    return Z

sx = (2,3)  # works with (2,1)
sy = (1,3)
# X, Y = np.ones(sx+(10,)), np.ones(sy+(3,))

X = np.repeat(np.arange(np.prod(sx)).reshape(sx)[...,None], 10, axis=-1)
Y = np.repeat(np.arange(np.prod(sy)).reshape(sy)[...,None], 3, axis=-1)

Z = F(X,Y)
print Z.shape
print Z[...,0]

Fast iteration over vectors in a multidimensional numpy array

2 Answers2