I'm writing some Python + NumPy + Cython code, and I'm trying to find the most elegant and efficient way of doing the following kind of iteration over an array:
Let's say I have a function f(x, y) that takes a vector x of shape (3,) and a vector y of shape (10,) and returns a vector of shape (10,). Now I have two arrays X and Y of shape sx + (3,) and sy + (10,), where sx and sy are two shapes that can be broadcast together (i.e. either sx == sy, or wherever an axis differs, one of the two has length 1 along that axis, in which case it is repeated). I want to produce an array Z of shape zs + (10,), where zs is the shape that results from broadcasting sx with sy. Each 10-dimensional vector in Z should equal f(x, y), where x and y are the vectors at the corresponding (broadcast) locations in X and Y.
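To make the intended semantics concrete, here is a slow pure-Python reference sketch of what I'm after. The body of f is just a placeholder I made up for illustration, broadcast_apply is a hypothetical helper name, and np.broadcast_shapes assumes NumPy 1.20 or newer. This is not the fast Cython version I'm looking for, only a statement of the expected result:

```python
import numpy as np

def f(x, y):
    # Placeholder kernel (an assumption, not the real function):
    # takes x of shape (3,) and y of shape (10,), returns shape (10,).
    return y * x.sum()

def broadcast_apply(f, X, Y):
    # Leading "loop" shapes, i.e. everything except the trailing vector axis.
    sx, sy = X.shape[:-1], Y.shape[:-1]
    zs = np.broadcast_shapes(sx, sy)  # requires NumPy >= 1.20

    # Broadcast only the leading axes; each trailing axis keeps its own length.
    Xb = np.broadcast_to(X, zs + X.shape[-1:])
    Yb = np.broadcast_to(Y, zs + Y.shape[-1:])

    Z = np.empty(zs + (10,), dtype=np.result_type(X, Y))
    # Visit every position in the broadcast leading shape and apply f
    # to the pair of vectors found there.
    for idx in np.ndindex(*zs):
        Z[idx] = f(Xb[idx], Yb[idx])
    return Z

X = np.random.rand(4, 1, 3)   # sx = (4, 1)
Y = np.random.rand(5, 10)     # sy = (5,)
Z = broadcast_apply(f, X, Y)  # zs = (4, 5), so Z has shape (4, 5, 10)
```

The Python-level loop over np.ndindex is exactly the part I would like to replace with something that Cython can compile into a tight loop.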
I looked into np.nditer, and while it plays nicely with Cython (see bottom of linked page), it doesn't seem to allow iterating over vectors from a multidimensional array, only over individual elements. I also looked at index grids, but the problem there is that Cython indexing is only fast when the number of indexes equals the dimensionality of the array and the indexes are stored as Cython integers rather than Python tuples.
Any help is greatly appreciated!