Say you have the following N-dimensional array
>>> import numpy as np
>>> Z = np.array(7*[6*[5*[4*[3*[range(2)]]]]])
>>> Z.ndim
6
Note that N = 6 here, but I want to keep N arbitrary in the discussion.
I then perform multiple-axis operations which (subjectively, of course) have the problematic effect of "collapsing" the dimensions along which they are computed (as described here and there).
Say that the axes of computation are
>>> axes = (0,2,5)
Thus the tuple's length lies in [1, N].
As you might have guessed, I want the output of, say, np.mean to have the same shape as its input. E.g.
>>> Y = np.mean(Z, axis=axes)
>>> Y.shape
(6L, 4L, 3L)
while
>>> Z.shape
(7L, 6L, 5L, 4L, 3L, 2L)
I have a homemade solution, as follows:
def nd_repeat(arr, der, axes):
    # Accept a single axis as well as a tuple of axes.
    if not isinstance(axes, tuple):
        axes = (axes,)
    # Give every reduced axis length 1 so that der broadcasts against arr.
    shape = list(arr.shape)
    for axis in axes:
        shape[axis] = 1
    return np.zeros_like(arr) + der.reshape(shape)
where, incidentally, der stands for "derived".
>>> nd_repeat(Z, Y, axes).shape
(7L, 6L, 5L, 4L, 3L, 2L)
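To make the reshape-then-broadcast step easier to follow, here is a minimal sketch on a smaller array. The array `A`, its shape, and the names `reduced`, `restored`, and `full` are illustrative assumptions and not part of the setup above.

```python
import numpy as np

# Small illustrative array; the shape is arbitrary and chosen for this sketch.
A = np.arange(24, dtype=float).reshape(2, 3, 4)
axes = (0, 2)

# Reducing over axes 0 and 2 leaves only axis 1.
reduced = A.mean(axis=axes)           # shape (3,)

# Put a length-1 axis back wherever a reduction happened.
shape = list(A.shape)
for axis in axes:
    shape[axis] = 1                   # shape becomes [1, 3, 1]

restored = reduced.reshape(shape)     # shape (1, 3, 1)

# Adding to an array of zeros broadcasts the length-1 axes
# back up to the full shape of A.
full = np.zeros_like(A) + restored    # shape (2, 3, 4)
```

The reshape is safe because a reduction keeps the surviving axes in their original order, so only length-1 axes are being reinserted.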
What is the built-in NumPy way to accomplish this N-dimensional repeat?
Regarding performance:
import timeit
homemade_s = """\
nd_repeat(Z, np.nanpercentile(Z, 99.9, axis=axes), axes)
"""
homemade_t = timeit.Timer(homemade_s, "from __main__ import nd_repeat,Z,axes,np").timeit(10000)
npbuiltin_s = """\
np.broadcast_to(np.nanpercentile(Z, 99.9, axis=axes, keepdims=True), Z.shape)
"""
npbuiltin_t = timeit.Timer(npbuiltin_s, "from __main__ import Z,axes,np").timeit(10000)
As might be expected,
>>> np.log(homemade_t/npbuiltin_t)
0.024082885343423521
my solution is ~2.4% slower than hpaulj's (the log ratio of 0.0241 corresponds to a time ratio of about 1.024).
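Beyond timing, the two approaches differ in what they return. The following sketch compares them under some assumptions: it uses np.mean rather than np.nanpercentile for brevity, and random data of the same shape as Z; the names `homemade` and `builtin` are hypothetical.

```python
import numpy as np

def nd_repeat(arr, der, axes):
    # Same logic as the homemade helper in the question.
    if not isinstance(axes, tuple):
        axes = (axes,)
    shape = list(arr.shape)
    for axis in axes:
        shape[axis] = 1
    return np.zeros_like(arr) + der.reshape(shape)

# Assumption: random float data with the same shape as Z in the question.
rng = np.random.default_rng(0)
Z = rng.random((7, 6, 5, 4, 3, 2))
axes = (0, 2, 5)

homemade = nd_repeat(Z, np.mean(Z, axis=axes), axes)

# keepdims=True leaves the reduced axes in place with length 1,
# so the result can be broadcast straight back to Z.shape.
builtin = np.broadcast_to(np.mean(Z, axis=axes, keepdims=True), Z.shape)

same_values = np.allclose(homemade, builtin)

# np.broadcast_to returns a read-only view, whereas the homemade
# version allocates a new, writable array of full size.
builtin_is_writeable = builtin.flags.writeable
homemade_is_writeable = homemade.flags.writeable
```

If a writable array is needed, calling `.copy()` on the broadcast result should give one, at the cost of allocating the full-size array.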