I'm trying to understand a max-pooling algorithm in NumPy. Many answers, like this one, suggest reshaping a two-dimensional image into a 4-dimensional array and then calling np.max over axes 1 and 3:
import numpy as np

window = (2, 4)  # pooling window: (height, width)
arr = np.random.randint(99, size=(1, 8, 12))
# target 4-D shape, here (4, 2, 3, 4)
shape = (arr.shape[1] // window[0], window[0], arr.shape[2] // window[1], window[1])
out = arr.reshape(shape).max(axis=(1, 3))
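For comparison, here is a minimal sketch of a brute-force check that computes the same pooling with explicit loops. The helper name max_pool_loops and the fixed seed are my own assumptions, added only to make the comparison reproducible; they are not part of the linked answers.

```python
import numpy as np

def max_pool_loops(img, window):
    """Brute-force max pooling over non-overlapping windows (hypothetical helper)."""
    wh, ww = window
    rows, cols = img.shape[0] // wh, img.shape[1] // ww
    out = np.empty((rows, cols), dtype=img.dtype)
    for i in range(rows):
        for j in range(cols):
            # Maximum of the (wh x ww) block whose top-left corner is (i*wh, j*ww).
            out[i, j] = img[i * wh:(i + 1) * wh, j * ww:(j + 1) * ww].max()
    return out

window = (2, 4)
rng = np.random.default_rng(0)  # fixed seed: an assumption, for reproducibility only
arr = rng.integers(99, size=(1, 8, 12))
shape = (arr.shape[1] // window[0], window[0], arr.shape[2] // window[1], window[1])

via_reshape = arr.reshape(shape).max(axis=(1, 3))
via_loops = max_pool_loops(arr[0], window)  # drop the leading axis of size 1

print(via_reshape.shape, via_loops.shape)      # (4, 3) (4, 3)
print(np.array_equal(via_reshape, via_loops))  # True
```

The loop version is slow, but it states directly what each output cell should contain, so it can serve as a reference for any reshape-based variant.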
According to my visual understanding, I should reduce over axis=(0, 2) instead,
so that those axes shrink to size 1 and produce an output like so:
That makes a lot of sense, but it's not correct:
arr = np.random.randint(99, size=(1,8,12)) =
[[[ 7 55 21 88 69 35  7  7 73 54 16 80]
  [70 79 62 55 42  5 77 81 38 52 69 39]
  [58 78 48 35  5 93 47 64 18 25 73 25]
  [14  8 63 27 28 46 29 68 28 38 51 79]
  [70 15 37 51 72 27 44 79  1 79 75  9]
  [ 4 27  0 90 15 30 95 62 14  8 69 57]
  [24 29 26 44 72 89 74 78 39 29  6  2]
  [82 12  0 11 54 38 61 79 91 92 53 28]]]
--------------------------------------------------
arr.reshape(4, 2, 3, 4).max(axis=(0, 2)) =
[[73 93 75 88]
 [91 92 95 90]]
--------------------------------------------------
arr.reshape(4, 2, 3, 4).max(axis=(1, 3)) =
[[88 81 80]
 [78 93 79]
 [90 95 79]
 [82 89 92]]
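To make the comparison reproducible, here is a minimal sketch that hard-codes the array printed above instead of drawing a new random one. It prints the shape left by each reduction and looks at one slice of the reshaped array. The variable name r is my own shorthand.

```python
import numpy as np

# The array printed above, with the leading axis of size 1 kept.
arr = np.array([[
    [ 7, 55, 21, 88, 69, 35,  7,  7, 73, 54, 16, 80],
    [70, 79, 62, 55, 42,  5, 77, 81, 38, 52, 69, 39],
    [58, 78, 48, 35,  5, 93, 47, 64, 18, 25, 73, 25],
    [14,  8, 63, 27, 28, 46, 29, 68, 28, 38, 51, 79],
    [70, 15, 37, 51, 72, 27, 44, 79,  1, 79, 75,  9],
    [ 4, 27,  0, 90, 15, 30, 95, 62, 14,  8, 69, 57],
    [24, 29, 26, 44, 72, 89, 74, 78, 39, 29,  6,  2],
    [82, 12,  0, 11, 54, 38, 61, 79, 91, 92, 53, 28],
]])

r = arr.reshape(4, 2, 3, 4)

# Reducing over an axis removes it, so the remaining axes set the output shape.
print(r.shape)                   # (4, 2, 3, 4)
print(r.max(axis=(0, 2)).shape)  # (2, 4)
print(r.max(axis=(1, 3)).shape)  # (4, 3)

# Fixing axes 0 and 2 selects one contiguous 2x4 block of the original image.
print(np.array_equal(r[0, :, 0, :], arr[0, 0:2, 0:4]))  # True
```

Printing other slices, such as r[:, 0, :, 0], may help to see which original pixels each axis steps through.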
So the actual result never agrees with my picture. What is the source of this disagreement? Why doesn't reducing over axis=(0, 2) work as I expect?