numpy provides a number of different ways of looping over one or more dimensions.
Your example:
func = np.sum
for fiber_index in np.ndindex(A.shape[:-1]):
    print(fiber_index)      # the (i, j) index pair
    print(A[fiber_index])   # the 1D fiber that func would operate on
produces something like:
(0, 0)
[0 1 2]
(0, 1)
[3 4 5]
(0, 2)
[6 7 8]
...
np.ndindex generates all the index combinations over the first 2 dimensions, giving your function the 1D fiber on the last.
Look at the code for ndindex; it's instructive. I tried to extract its essence in https://stackoverflow.com/a/25097271/901925. It uses as_strided to generate a dummy array over which an nditer iterates. It uses the 'multi_index' mode to generate an index set, rather than the elements of that dummy. The iteration itself is done with a __next__ method. This is the same style of indexing that is currently used in numpy compiled code.
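To see that essence in pure Python, here is a minimal sketch (my_ndindex is just an illustrative name, not numpy's actual implementation):
import numpy as np
from numpy.lib.stride_tricks import as_strided

def my_ndindex(shape):
    # a dummy array of the requested shape; all-zero strides mean every
    # element points at the same memory, so it costs essentially nothing
    dummy = as_strided(np.zeros(1), shape=shape, strides=(0,) * len(shape))
    it = np.nditer(dummy, flags=['multi_index'], order='C')
    while not it.finished:
        yield it.multi_index    # the index tuple, not the dummy's element
        it.iternext()

list(my_ndindex((2, 3)))
# [(0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (1, 2)]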
The nditer documentation, Iterating Over Arrays (http://docs.scipy.org/doc/numpy-dev/reference/arrays.nditer.html), has a good explanation, including an example of doing the iteration in cython.
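For reference, the basic multi_index pattern from that page looks roughly like this (a sketch that prints each index tuple and the element it points at):
import numpy as np

a = np.arange(6).reshape(2, 3)
it = np.nditer(a, flags=['multi_index'])
while not it.finished:
    print(it.multi_index, it[0])   # e.g. (0, 0) 0, then (0, 1) 1, ...
    it.iternext()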
Many functions, among them sum, max, and product, let you specify which axis (or axes) you want to iterate over. Your example, with sum, can be written as:
np.sum(A, axis=-1)
np.sum(A, axis=(1,2)) # sum over 2 axes
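With the A implied by the output above (I'm assuming A = np.arange(27).reshape(3, 3, 3)), those give:
np.sum(A, axis=-1)
# array([[ 3, 12, 21],
#        [30, 39, 48],
#        [57, 66, 75]])
np.sum(A, axis=(1, 2))
# array([ 36, 117, 198])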
An equivalent is:
np.add.reduce(A, axis=-1)
np.add is a ufunc, and reduce specifies an iteration mode. There are many other ufuncs, and other iteration modes - accumulate, reduceat. You can also define your own ufunc.
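A few of those modes in action (the lambda-based ufunc is just a toy example of defining your own):
import numpy as np

x = np.arange(8)
np.add.reduce(x)              # 28, same as np.sum(x)
np.add.accumulate(x)          # running sum: [ 0  1  3  6 10 15 21 28]
np.add.reduceat(x, [0, 4])    # sums of x[0:4] and x[4:]: [ 6 22]

# a (slow) custom ufunc built from a plain Python function
myadd = np.frompyfunc(lambda a, b: a + b, 2, 1)
myadd(x, 10)                  # object array: [10 11 12 13 14 15 16 17]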
xnx suggests:
np.apply_along_axis(np.sum, 2, A)
It's worth digging through apply_along_axis to see how it steps through the dimensions of A. In your example, it steps over all possible i, j in a while loop, calculating:
outarr[(i,j)] = np.sum(A[(i, j, slice(None))])
Including slice objects in the indexing tuple is a nice trick. Note that it edits a list and then converts it to a tuple for indexing; that's because tuples are immutable.
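A stripped-down sketch of that trick (again assuming the 3x3x3 A from above):
import numpy as np

A = np.arange(27).reshape(3, 3, 3)
idx = [0] * A.ndim        # a mutable list of indices
idx[-1] = slice(None)     # take the whole last axis
A[tuple(idx)]             # same as A[0, 0, :] -> array([0, 1, 2])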
Your iteration can be applied along any axis by rolling that axis to the end. This is a 'cheap' operation, since it just changes the strides.
def with_ndindex(A, func, ax=-1):
    # apply func along axis ax of A
    A = np.rollaxis(A, ax, A.ndim)    # roll ax to the end (just changes strides)
    shape = A.shape[:-1]
    B = np.empty(shape, dtype=A.dtype)
    for ii in np.ndindex(shape):
        B[ii] = func(A[ii])
    return B
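For example (np.ptp here just stands in for any function that maps a 1D fiber to a scalar):
A = np.arange(27).reshape(3, 3, 3)
np.allclose(with_ndindex(A, np.sum), np.sum(A, axis=-1))   # True
with_ndindex(A, np.ptp, ax=0)    # peak-to-peak along the first axis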
I did some timings on 3x3x3, 10x10x10 and 100x100x100 A arrays. This np.ndindex approach is consistently about a third faster than the apply_along_axis approach. Direct use of np.sum(A, -1) is much faster than either.
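Something like the following reproduces the comparison (illustrative only; it assumes with_ndindex is defined in the same script):
import timeit

setup = ("import numpy as np; from __main__ import with_ndindex; "
         "A = np.random.rand(100, 100, 100)")
for stmt in ["with_ndindex(A, np.sum)",
             "np.apply_along_axis(np.sum, 2, A)",
             "np.sum(A, -1)"]:
    print(stmt, timeit.timeit(stmt, setup=setup, number=10))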
So if func is limited to operating on a 1D fiber (unlike sum), then the ndindex approach is a good choice.