
Given a large 3D numpy array (not too large to fit in memory) of type 'uint8', I would like to downscale it by a given scale factor in each dimension. You may assume the shape of the array is divisible by the scale factor.

The values of the array are in [0, 1, ..., max], where max is always smaller than 6. I would like to scale the array down such that each 3D block of shape "scale_factor" is reduced to the value that occurs most often in that block. In case of a tie, any of the tied values is fine (I don't care which).
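To make the intended behaviour explicit, a naive loop-based reference would look roughly like the sketch below (far too slow for 128x128x128, and mode_downscale is just a name I made up for this example); ties resolve to the smallest value, which is fine for my purposes.

import numpy as np

def mode_downscale(a, scale):
    # Reference implementation: most frequent value per scale[0] x scale[1] x scale[2] block
    m, n, r = np.array(a.shape) // scale
    out = np.empty((m, n, r), dtype=a.dtype)
    for i in range(m):
        for j in range(n):
            for k in range(r):
                block = a[i*scale[0]:(i+1)*scale[0],
                          j*scale[1]:(j+1)*scale[1],
                          k*scale[2]:(k+1)*scale[2]]
                # bincount().argmax() picks the smallest value on ties
                out[i, j, k] = np.bincount(block.ravel()).argmax()
    return out

small = np.random.randint(0, 4, (4, 4, 4), dtype='uint8')
print(mode_downscale(small, (2, 2, 2)))  # shape (2, 2, 2)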

I have tried the following, which works:

import numpy as np

array = np.random.randint(0, 4, ((128, 128, 128)), dtype='uint8')
scale_factor = (4, 4, 4)
bincount = 4  # number of distinct values (0..3), so minlength must be 4

# Reshape to free dimension of size scale_factor to apply scaledown method to
m, n, r = np.array(array.shape) // scale_factor
array = array.reshape((m, scale_factor[0], n, scale_factor[1], r, scale_factor[2]))


# Making histogram, first over last axis, then sum over other two
array = np.apply_along_axis(lambda x: np.bincount(x, minlength=bincount),
                            axis=5, arr=array)
array = np.apply_along_axis(lambda x: np.sum(x), axis=3, arr=array)
array = np.apply_along_axis(lambda x: np.sum(x), axis=1, arr=array).astype('uint8')

array = np.argmax(array, axis=3)

This works, but the bincount step is terribly slow. I also got np.histogram to work, but it was also very slow. I suspect neither method is really designed for my purpose; they offer many more features than I need, which slows them down.

My question is, does anyone know a faster method? I would also be happy if someone could point me to a method from a deep learning library that does this, but that is not officially the question.

F.Wessels

2 Answers


@F.Wessels is thinking in the right direction, but the answer is not quite there yet. The speed can be improved by more than a hundred times if you take matters into your own hands instead of using np.apply_along_axis.

First of all, when you divide your 3D array into blocks, the dimensions change from 128x128x128 to 32x4x32x4x32x4. Intuitively, you have 32x32x32 blocks, each of size 4x4x4. Instead of keeping each block as 4x4x4, you can squeeze it down to a flat vector of 64 values and find the most frequent item there. Here's the trick: it also doesn't matter whether the blocks are arranged as 32x32x32x64 or as 32768x64. Basically, we've dropped back to 2D, where everything is easier.
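To see the dimension juggling in isolation, the same rearrangement can be written with a single transpose (just a sketch of the reshape logic, not the exact code used below):

import numpy as np

a = np.random.randint(0, 4, (128, 128, 128), dtype='uint8')

# 128x128x128 -> 32x4x32x4x32x4 -> 32x32x32x4x4x4 -> 32768x64
blocks = (a.reshape(32, 4, 32, 4, 32, 4)
           .transpose(0, 2, 4, 1, 3, 5)
           .reshape(-1, 64))
print(blocks.shape)  # (32768, 64)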

Now, with your 2D array of size 32768x64, you can do the bin-counting yourself with a list comprehension and numpy ops; as the timings below show, it is dramatically faster.

import time
import numpy as np

array = np.random.randint(0, 4, ((128, 128, 128)), dtype='uint8')
scale_factor = (4, 4, 4)
bincount = 4

def prev_func(array):
    # Reshape to free dimension of size scale_factor to apply scaledown method to
    m, n, r = np.array(array.shape) // scale_factor
    arr = array.reshape((m, scale_factor[0], n, scale_factor[1], r, scale_factor[2]))
    arr = np.swapaxes(arr, 1, 2).swapaxes(2, 4)
    arr = arr.reshape((m, n, r, np.prod(scale_factor)))
    # Obtain the element that occurred the most
    arr = np.apply_along_axis(lambda x: max(set(x), key=lambda y: list(x).count(y)),
                              axis=3, arr=arr)
    return arr

def new_func(array):
    # Reshape to free dimension of size scale_factor to apply scaledown method to
    m, n, r = np.array(array.shape) // scale_factor
    arr = array.reshape((m, scale_factor[0], n, scale_factor[1], r, scale_factor[2]))
    arr = np.swapaxes(arr, 1, 2).swapaxes(2, 4)
    arr = arr.reshape((m, n, r, np.prod(scale_factor)))
    # Collapse dimensions
    arr = arr.reshape(-1,np.prod(scale_factor))
    # Get blockwise frequencies -> Get most frequent items
    arr = np.array([(arr==b).sum(axis=1) for b in range(bincount)]).argmax(axis=0)
    arr = arr.reshape((m,n,r))
    return arr

N = 10

start1 = time.time()
for i in range(N):
    out1 = prev_func(array)
end1 = time.time()
print('Prev:',(end1-start1)/N)

start2 = time.time()
for i in range(N):
    out2 = new_func(array)
end2 = time.time()
print('New:',(end2-start2)/N)

print('Difference:',(out1-out2).sum())

Out:

Prev: 1.4244404077529906
New: 0.01667332649230957
Difference: 0

As you can see, there is no difference in the results, even though I've juggled the dimensions around. NumPy's reshape preserves the order of the values when going down to 2D, since the last dimension of 64 is kept intact, and that order is re-established when reshaping back to (m, n, r). The original method from the question took around 5 seconds per run on my machine, so empirically this is a speed-up of roughly 300 times.

Mercury
  • counts = np.zeros_like(arr); np.add.at(counts.reshape(-1), (arr + np.arange(len(arr))[:, None] * bincount).flatten(), 1); return counts.argmax(axis=1) – Eelco Hoogendoorn Mar 22 '20 at 16:05
  • The above makes only a single alloc of size (128, bincount), thus avoiding O(bincount**2) complexity. Probably no big deal for a small bincount, but it might add up quickly. – Eelco Hoogendoorn Mar 22 '20 at 16:07
  • Amazing. This works 15x faster than applying `scipy.stats.mode` to the same thing. Not entirely surprising for a small number of possibilities. – Mad Physicist Jan 07 '22 at 06:17
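Spelled out, the np.add.at suggestion from the comments looks roughly like the sketch below. This is one reading of it: the counts buffer is allocated as (num_blocks, bincount) rather than with np.zeros_like so that the flat indices line up, and blockwise_mode is only an illustrative name. Used in place of the list-comprehension line in new_func, i.e. arr = blockwise_mode(arr, bincount), it produces the same flat result before the final reshape.

import numpy as np

def blockwise_mode(arr2d, bincount):
    # arr2d has shape (num_blocks, block_size), values in [0, bincount)
    num_blocks = len(arr2d)
    counts = np.zeros((num_blocks, bincount), dtype=np.intp)
    # Scatter-add a 1 into the (block, value) cell for every element, in one call
    flat_idx = (arr2d + np.arange(num_blocks)[:, None] * bincount).ravel()
    np.add.at(counts.reshape(-1), flat_idx, 1)
    # Most frequent value per block
    return counts.argmax(axis=1)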

Well, here is a similar method, but faster. It only replaces the bincount call with a simpler implementation suited to this use case: `lambda x: max(set(x), key=lambda y: list(x).count(y))`, where the array is first reshaped so the method can be applied directly along one dimension.

On my laptop, with a 128x128x128 array, it is around 4 times faster:

import time
import numpy as np

array = np.random.randint(0, 4, ((128, 128, 128)), dtype='uint8')
scale_factor = (4, 4, 4)
bincount = 4

start_time = time.time()
N = 10
for i in range(N):

    # Reshape to free dimension of size scale_factor to apply scaledown method to
    m, n, r = np.array(array.shape) // scale_factor
    arr = array.reshape((m, scale_factor[0], n, scale_factor[1], r, scale_factor[2]))
    arr = np.swapaxes(arr, 1, 2).swapaxes(2, 4)
    arr = arr.reshape((m, n, r, np.prod(scale_factor)))

    # Obtain the element that occurred the most
    arr = np.apply_along_axis(lambda x: max(set(x), key=lambda y: list(x).count(y)),
                              axis=3, arr=arr)

print((time.time() - start_time) / N)

There is still a big gap compared to built-in methods such as np.mean(), though.
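For reference, the kind of built-in pipeline I mean is a mean downscale, which needs nothing more than a reshape and np.mean (this computes the block average, not the block mode the question asks for):

import numpy as np

array = np.random.randint(0, 4, (128, 128, 128), dtype='uint8')
scale_factor = (4, 4, 4)

# Mean pooling over 4x4x4 blocks: reshape to expose the block axes, then average over them
m, n, r = np.array(array.shape) // scale_factor
blocks = array.reshape(m, scale_factor[0], n, scale_factor[1], r, scale_factor[2])
mean_down = blocks.mean(axis=(1, 3, 5))  # shape (32, 32, 32), dtype float64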

F.Wessels