
I have an ndarray, and I want to replace every value in the array with the mean of its adjacent elements. The code below can do the job, but it is super slow when I have 700 arrays, all with shape (7000, 7000), so I wonder if there are better ways to do it. Thanks!

import numpy as np

a = np.array([[1, 2, 3, 4, 5, 6, 7, 8, 9],
              [4, 5, 6, 7, 8, 9, 10, 11, 12],
              [3, 4, 5, 6, 7, 8, 9, 10, 11]])
row, col = a.shape
new_arr = np.empty(a.shape)  # output array; every element gets assigned below
for x in range(row):
    for y in range(col):
        # Clamp the window at the top/left borders; slicing already clamps at the bottom/right.
        min_x = max(0, x - 1)
        min_y = max(0, y - 1)
        new_arr[x, y] = a[min_x:x + 2, min_y:y + 2].mean()
print(new_arr)

3 Answers


Well, that's a smoothing operation in image processing, which can be achieved with 2D convolution. Your code works a bit differently on the near-boundary elements (it averages over a smaller window there), so if we accept a small difference at the boundaries, you can use scipy's convolve2d like so -

from scipy.signal import convolve2d as conv2

out = conv2(a, np.ones((3,3)), 'same') / 9.0

This specific operation is built into the OpenCV module as cv2.blur, and it is very efficient at it. The name basically describes its operation of blurring the input arrays representing images. I believe the efficiency comes from the fact that internally it's implemented entirely in C, with a thin Python wrapper to handle NumPy arrays.

So, the output could be alternatively calculated with it, like so -

import cv2 # Import OpenCV module

out = cv2.blur(a.astype(float),(3,3))

Here's a quick showdown of timings on a decently big image/array -

In [93]: a = np.random.randint(0,255,(5000,5000)) # Input array

In [94]: %timeit conv2(a,np.ones((3,3)),'same')/9.0
1 loops, best of 3: 2.74 s per loop

In [95]: %timeit cv2.blur(a.astype(float),(3,3))
1 loops, best of 3: 627 ms per loop
  • Nicely done. I was about to write something like this. +1. – rayryeng Jul 06 '16 at 20:06
  • @rayryeng Good to see you with NumPy tag! ;) – Divakar Jul 06 '16 at 20:07
  • @Divakar It's on my feed :) I haven't been answering questions lately though... lots of stuff going on on my end. – rayryeng Jul 06 '16 at 20:08
  • [uniform_filter](http://docs.scipy.org/doc/scipy-0.15.1/reference/generated/scipy.ndimage.filters.uniform_filter.html) is a short-cut for this... but since the `signal` module operates in the Fourier domain, I guess this would be faster? – Imanol Luengo Jul 06 '16 at 20:09
  • @ImanolLuengo Ah uniform_filter seems much faster! Post that as a solution! – Divakar Jul 06 '16 at 20:12
  • @Divakar I guess.. if `uniform_filter` is faster it must be because it does two 1D convolutions (by separating the kernel). I will check something before :P – Imanol Luengo Jul 06 '16 at 20:14
  • @ImanolLuengo Yeah, I am not sure about the internals of it, could be that. – Divakar Jul 06 '16 at 20:15
  • @Divakar [just checked](https://github.com/scipy/scipy/blob/v0.15.1/scipy/ndimage/filters.py#L765), internally it calls `uniform_filter1d` N times (once per dimension), which is probably why it gets the speedup over the `signal` module. For all other non-separable filters, your `conv2d` should always be faster. – Imanol Luengo Jul 06 '16 at 20:33
  • Thanks @ImanolLuengo and Divakar for the answers! One more question, if I want to change mean to max, do you know what commands can do this? There doesn't seem to be a max option in convolve2d... – Chiefscreation Jul 07 '16 at 18:31
  • @Chiefscreation Well, there is a max filter too in scipy: http://docs.scipy.org/doc/scipy-0.16.0/reference/generated/scipy.ndimage.filters.maximum_filter.html – Divakar Jul 07 '16 at 18:55
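Following up on that last comment, here is a minimal sketch of the "max" variant using scipy.ndimage.maximum_filter (assuming scipy is available; size=3 matches the 3x3 neighborhood from the question, and the mode parameter controls the boundary handling):

import numpy as np
from scipy.ndimage import maximum_filter

a = np.array([[1, 2, 3, 4],
              [5, 6, 7, 8],
              [9, 10, 11, 12]])

# Each output element is the maximum over its 3x3 neighborhood.
out_max = maximum_filter(a, size=3, mode='nearest')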

Following the discussion with @Divakar, find below a comparison of the different convolution methods available in scipy:

import numpy as np
from scipy import signal, ndimage

def conv2(A, size):
    # Direct 2D convolution with a normalized box kernel.
    return signal.convolve2d(A, np.ones((size, size)), mode='same') / float(size**2)

def fftconv(A, size):
    # Same box filter, computed via FFT-based convolution.
    return signal.fftconvolve(A, np.ones((size, size)), mode='same') / float(size**2)

def uniform(A, size):
    # scipy's built-in uniform (box/mean) filter.
    return ndimage.uniform_filter(A, size, mode='constant')

All 3 methods return exactly the same result. Note, however, that uniform_filter takes a mode='constant' parameter, which sets the boundary condition of the filter; constant (with the default cval=0, i.e. zero padding) is the same boundary condition that the other two methods enforce with mode='same'. For different use cases you can change the boundary conditions.
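For instance, a quick sanity check (just a sketch, reusing the three functions defined above on a small random array) confirms the agreement under these boundary conditions:

B = np.random.randn(200, 200)  # small test array, only for the check

# fftconvolve introduces tiny floating-point error, so compare with allclose.
assert np.allclose(conv2(B, 3), fftconv(B, 3))
assert np.allclose(conv2(B, 3), uniform(B, 3))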

Now some test matrices:

A = np.random.randn(1000, 1000)

And some timings:

%timeit conv2(A, 3)     # 33.8 ms per loop
%timeit fftconv(A, 3)   # 84.1 ms per loop
%timeit uniform(A, 3)   # 17.1 ms per loop

%timeit conv2(A, 5)     # 68.7 ms per loop
%timeit fftconv(A, 5)   # 92.8 ms per loop
%timeit uniform(A, 5)   # 17.1 ms per loop

%timeit conv2(A, 10)     # 210 ms per loop
%timeit fftconv(A, 10)   # 86 ms per loop
%timeit uniform(A, 10)   # 16.4 ms per loop

%timeit conv2(A, 30)     # 1.75 s per loop
%timeit fftconv(A, 30)   # 102 ms per loop
%timeit uniform(A, 30)   # 16.5 ms per loop

So, in short, uniform_filter seems fastest, and that is because the convolution is separable into two 1D convolutions (similar to gaussian_filter, which is also separable).
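To illustrate the separability point, here is a minimal sketch (assuming the same mode='constant' boundary condition as above) showing that two 1D uniform passes reproduce the 2D filter:

from scipy.ndimage import uniform_filter1d

def uniform_separable(A, size):
    # One 1D box filter per axis: O(size) work per pixel per pass,
    # rather than O(size**2) for the full 2D kernel.
    tmp = uniform_filter1d(A, size, axis=0, mode='constant')
    return uniform_filter1d(tmp, size, axis=1, mode='constant')

assert np.allclose(uniform_separable(A, 5), uniform(A, 5))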

Other non-separable filters with different kernels are more likely to be faster using the signal module (the one in @Divakar's solution).

The speed of both fftconvolve and uniform_filter remains roughly constant as the kernel size grows, while convolve2d gets progressively slower (as the timings above show).

  • Good findings! Gotta digest all that stuffs now. – Divakar Jul 06 '16 at 20:36
  • @Divakar just found something quite interesting, for different *small* kernel sizes, the last 2 methods maintain *constant* execution time. – Imanol Luengo Jul 06 '16 at 20:38
  • I think I understood the uniform_filter's implementation with two passes to process this. That `fftconvolve`'s internal implementation looks messy though. Thanks again for coming up with those useful findings! – Divakar Jul 06 '16 at 21:01
  • @Divakar [wikipedia](https://en.wikipedia.org/wiki/Separable_filter) actually explains pretty well what `uniform_filter` is doing (essentially the same example)! – Imanol Luengo Jul 06 '16 at 21:50

I had a similar problem recently and had to find a different solution since I can't use scipy.

import numpy as np

a = np.random.randint(100, size=(7000, 7000))  # array of 7000 x 7000
row, col = a.shape

column_totals = a.sum(axis=0)  # sum of every column, as a single 1D array

new_array = np.zeros([row, col])  # create a receiving array

for i in range(row):
    # Resulting row = the total of all rows minus the original row,
    # divided by the number of rows minus one.
    new_array[i] = (column_totals - a[i]) / (row - 1)

print(new_array)
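As a side note, the loop can be dropped entirely with NumPy broadcasting; a minimal sketch of the same computation:

# column_totals has shape (col,) and a has shape (row, col), so the
# subtraction broadcasts across rows in one vectorized step.
new_array_vec = (column_totals - a) / (row - 1)
assert np.allclose(new_array_vec, new_array)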