Vectorwise iteration of nD array with nditer

Question

Given I have two arrays, say A (shape K,L,M) and B (shape K,M).

I want to iterate vectorwise and construct an output C (shape equal to A) by running a function f on each input vector a and scalar b and then reassembling it into the output (i.e. for each c = f(a, b) (where a = A[i, :, j], b = B[i, j], c as a)). In this example the vector axis would be a.shape[1], but in general it could be any other also.

After reading the documentation page of nditer, I thought it should be appropriate and elegant to use here, since apparently it can allocate everything for you, allows a separate external loop, and easily allows reassembly of the output.

However, I am unable to even make something as simple as a vector-wise copy (again along axis) of an existing array using nditer work properly. Is what I want to do simply not possible with nditer or am I using it wrong?

def test(arr, offsets, axis=0):
    #out = np.zeros_like(arr)
    with np.nditer([arr, None], flags=['external_loop'], #[arr, out]
                   op_flags=[['readonly'], ['writeonly', 'allocate']],
                   op_axes=[[axis], None], #[[axis], [axis]]
                  ) as ndit:
        for i, o in ndit:
            print(i.shape, o.shape)
            o[...] = i
        return ndit.operands[1]
tested = test(xam.data, shifts, axis=1)
print('test output shape', tested2.shape)

>>> (<L>,) (<L>,)
>>> test output shape (<L>,)

This gives an output only of the very first input. Even if I explicitly give an output that has the same shape as input (e.g. via the commented out changes), then the nditer only runs once on the very first length L vector.

>>> (<L>,) (<L>,)
>>> test output shape (<N>, <L>, <M>)

I have made an alternative version using rollaxis views, but it is not particularly pretty or intuitive, so I was wondering if it should not also be possible with nditer, somehow...

def test2(arr, offsets, axis=0):
    arr_r = np.rollaxis(arr, axis).reshape((arr.shape[axis], -1)).T
    out = np.zeros_like(arr)
    out_r = np.rollaxis(out, axis).reshape((arr.shape[axis], -1)).T  # create view
    for i, o in zip(arr_r, out_r):
        o[...] = i

    return out

`nditer` may not be worth your time. Generally it is slower than the equivalent explicit loop, though it can help with some complex setups. But grasping what it does with all the parameters is a lot of work. Mostly I recommend it is a stepping stone toward a `cython` usage, as illustrated on the other doc page, https://numpy.org/doc/stable/reference/arrays.nditer.html#arrays-nditer. It usually is slow, though external loop may compensate. I haven't played with that feature. — hpaulj, Jul 27 '22 at 15:23
That `[axis]` parameter is restricting how `nditer` iterates. I haven't fully figured out what `op_axes` does, but clearly `[[1]]` isn't what you want. — hpaulj, Jul 27 '22 at 17:07

hpaulj · Answer 1 · 2022-07-27T18:16:02.377

Changing your function to work with a list/tuple of axes:

In [378]: def test(arr, offsets, axis=0):
     ...:     #out = np.zeros_like(arr)
     ...:     with np.nditer([arr, None],flags=['external_loop'], #[arr, out]
     ...:                    op_flags=[['readonly'], ['writeonly', 'allocate']],
     ...:                    op_axes=[axis, None], #[[axis], [axis]]
     ...:                   ) as ndit:
     ...:         for i, o in ndit:
     ...:             print(i.shape, o.shape)
     ...:             print(i)
     ...:             o[...] = i
     ...:         return ndit.operands[1]
     ...:

Now it iterates on the whole 2d array. With external_loop it passes a whole (flat) array.

In [379]: test(np.arange(12).reshape((3,4)),0,axis=[0,1])
(12,) (12,)
[ 0  1  2  3  4  5  6  7  8  9 10 11]
Out[379]: 
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])
In [380]: test(np.arange(12).reshape((3,4)),0,axis=[1,0])
(12,) (12,)
[ 0  1  2  3  4  5  6  7  8  9 10 11]
Out[380]: 
array([[ 0,  4,  8],
       [ 1,  5,  9],
       [ 2,  6, 10],
       [ 3,  7, 11]])

test2

Adding the print to test2 to better see what's passed:

In [385]: def test2(arr, offsets, axis=0):
     ...:     arr_r = np.rollaxis(arr, axis).reshape((arr.shape[axis], -1)).T
     ...:     out = np.zeros_like(arr)
     ...:     out_r = np.rollaxis(out, axis).reshape((arr.shape[axis], -1)).T  # create view
     ...:     for i, o in zip(arr_r, out_r):
     ...:         print(i.shape, i)
     ...:         o[...] = i
     ...:     return out
     ...: 
In [386]: test2(np.arange(12).reshape((3,4)),0,axis=0)
(3,) [0 4 8]
(3,) [1 5 9]
(3,) [ 2  6 10]
(3,) [ 3  7 11]
Out[386]: 
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])
In [387]: test2(np.arange(12).reshape((3,4)),0,axis=1)
(4,) [0 1 2 3]
(4,) [4 5 6 7]
(4,) [ 8  9 10 11]
Out[387]: 
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

timings

Taking out the prints to do timings:

nditer:

In [391]: timeit test0(np.arange(12).reshape((3,4)),0,axis=(0,1))
11.6 µs ± 36.4 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)

iteration:

In [392]: timeit test20(np.arange(12).reshape((3,4)),0,axis=0)
26.5 µs ± 732 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)

nditer, but without external_loop

In [395]: timeit test01(np.arange(12).reshape((3,4)),0,axis=(0,1))
17.9 µs ± 700 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)

Often in time tests nditer performs slower. Here though the external_loop case only has to iterate once, passing the whole flattened array to the body.

big picture

So far we are just trying to iterate through a 2d array. In the intro you talk of using

 A (shape K,L,M) and B (shape K,M).

Normally in numpy we try to avoid any iteration. If B is (K,1,M) (as with B[:,None,:], then we can do all kinds of things with them

 C = A + B[:,None]
 C = A * B[:,None]

without needing to iterate. Any python level iteration with arrays slows down the code.

Vectorwise iteration of nD array with nditer

1 Answers1

test2

timings

big picture

Linked