1

I have a numpy.ndarray x of shape (...,3) i.e. with an arbitrary number of axes, the last axis having a known size of 3. I also have a function f that takes as argument an array of shape (3) (in fact a point in a 3D space) and returns another array of shape (3) (in fact a vector in a 3D space). Unfortunately, this function cannot (at least easily) be vectorized.

Using numpy.nditer, how can I efficiently parse the array x along all its axes except the last one in order to fill an array y (whose shape is equal to the one of x) with the result of f?

The following piece of code will do it without use of nditer :

import numpy as np

def f(x):
   '''Simple function for this exemple.
   Can only deal with array of shape (3,)
   '''
   assert x.ndim == 1 and x.shape[0] == 3
   y = np.zeros_like(x)
   y[0] = x[0]
   y[1] = x[1]**2
   y[2] = x[2]**3
   return y

x = np.arange(15).reshape(5,3)

_x = x.reshape(-1,3)
_y = np.zeros_like(_x)
for i in xrange(_x.shape[0]):
    _y[i,:] = f(_x[i,:])
y = _y.reshape(x.shape)

but does not look 'pythonic' to me.

As a bonus question, will there be an advantage in terms of speed in using nditer rather than the classical python loop above ?

SylM
  • 68
  • 8
  • Not sure if I have fully grapsed the question(s) in this post, but for that particular `f`, you can just do `x**[1,2,3]`. – Divakar Feb 16 '16 at 13:51
  • 1
    `nditer` is not any faster. It is most useful as stepping stone to using it in `c` or `cython`. But for this problem, Look at `ndindex`, especially its code. Also search SO for it. – hpaulj Feb 16 '16 at 14:16

1 Answers1

2

The core of what you are doing is reshaping the array to 2d, iterating on one axis, and reshaping back

_x = x.reshape(-1,3)
_y = np.zeros_like(_x)
for i in xrange(_x.shape[0]):
    _y[i,:] = f(_x[i,:])
y = _y.reshape(x.shape)

compare that to what tensordot does:

newshape_a = (-1, N2)
....
at = a.transpose(newaxes_a).reshape(newshape_a)
bt = b.transpose(newaxes_b).reshape(newshape_b)
res = dot(at, bt)
return res.reshape(olda + oldb)

Basically the same strategy. If it doesn't look sufficiently 'pythonic' you can hide the messy details in a function. :)

This kind of reshaping is most useful when the underlying function can handle 2 dimensions, one active, and one that is passive, 'going along for the ride'. Transpose can move the active axis to the front or back, depending on what's most convenient.

Another strategy, used in apply_along_axis is to construct an indexing list:

for i in range(N):
    fun(arr[tuple([slice(N),slice(N)...,i])]

I've answered similar questions in the past about nditer. https://stackoverflow.com/a/28727290/901925

np.ndindex is a good example of using nditer to iterate over a subset of the axes. Look at its code. Basically it constructs a dummy array of the right shape, and generates indices of the multi_index type.

This is also illustrated at http://docs.scipy.org/doc/numpy/reference/arrays.nditer.html#tracking-an-index-or-multi-index

That indexing doc is the best description of nditer. Notice how it ends with an cython example. I think that's the best use of the Python nditer - as a stepping stone to using it in cython or c.

In Python it can be useful as a way of iterating over several input and output arrays in a coordinated fashion, but it does not have any speed advantages.

Community
  • 1
  • 1
hpaulj
  • 221,503
  • 14
  • 230
  • 353
  • Thanks for this very documented answer. When I'll try to "cythonize" this part of the code, I will get back to the `nditer` solution. – SylM Mar 07 '16 at 15:52