I'm trying to compute the moving averages along the rows of a 2-dimensional numpy array, based on a given 'window' (i.e. the number of rows included in the average) and an 'offset'. I've come up with the code below, which I know is not efficient:
import numpy as np

def f(array, window, offset):
    # Start from an all-NaN result; rows without a full window stay NaN
    x = np.empty(array.shape)
    x[:, :] = np.nan
    for row_num in range(array.shape[0]):
        first_row = row_num - window - offset
        last_row = row_num - offset + 1
        if first_row >= 0:
            # Column-wise mean over the selected rows, ignoring NaNs
            x[row_num] = np.nanmean(array[first_row:last_row], axis=0)
    return x
I've found a potential solution here, adapted below for my code:
import math
from scipy.ndimage import uniform_filter

def g(array, window, offset):
    return uniform_filter(array, size=(window+1, 1), mode='nearest', origin=(math.ceil((window+1)/2-1), 0))
This solution, however, has 3 problems:
- First, I'm not sure how to implement the 'offset'
- Second, I'm not sure whether it is indeed more efficient
- Third, and most importantly, it doesn't work when the input array contains np.nan. As soon as a np.nan enters the window, it propagates into the moving average instead of being ignored, which is what np.nanmean would do.
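To make the third point concrete, here is a minimal sketch of the difference I mean, on a single made-up window of values (the array is only an illustration, not my real data):

```python
import numpy as np

# One column of a hypothetical window that contains a single missing value.
window_values = np.array([1.0, np.nan, 3.0])

# A plain mean lets the NaN propagate into the result.
plain_mean = np.mean(window_values)

# np.nanmean skips the NaN and averages the remaining values: (1 + 3) / 2.
nan_aware_mean = np.nanmean(window_values)

print(plain_mean, nan_aware_mean)
```

The second behaviour is the one I need for every window.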
Is there an efficient way to achieve what I'm trying to get?
Update
As suggested by Ehsan, I've implemented the code below (with a small modification), which gives the same result as my original code for any offset above 0:
from skimage.util import view_as_windows

def h(array, window, offset):
    return np.vstack(([[np.nan]*array.shape[-1]]*(window+offset), np.vstack(np.nanmean(view_as_windows(array, (window+1, array.shape[-1])), -2)[:-offset])))
I'm just not sure how to make it work for any offset (in particular, offset=0). Also, this solution seems to consume more time than the original one:
a = np.arange(10*11).reshape(10,11)
%timeit f(a, 5, 2)
%timeit h(a, 5, 2)
>>> 36.6 µs ± 709 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
>>> 67.5 µs ± 2.34 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
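Regarding offset=0, my understanding is that the problem comes from the slice `[:-offset]` in h: with offset=0 it becomes `[:-0]`, which is the same as `[:0]` and selects nothing. A small sketch of the pitfall and of one possible workaround (`drop_last` is a hypothetical helper name, not part of any library):

```python
import numpy as np

rows = np.arange(5)

def drop_last(values, offset):
    """Hypothetical helper: drop the last `offset` items, also for offset=0."""
    # An explicit end index avoids the special case of a negative zero.
    return values[:len(values) - offset]

print(rows[:-2])           # the last two items are dropped, as intended
print(rows[:-0])           # same as rows[:0], so nothing is left
print(drop_last(rows, 0))  # all items are kept
print(drop_last(rows, 2))  # the last two items are dropped
```

The same explicit end index could presumably be used inside h, but I haven't checked whether that alone makes it match f for offset=0.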
Is there an alternative that takes less time?
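One direction that might be worth trying is a cumulative-sum approach using numpy only. This is just a sketch under my own assumptions: `moving_nanmean` is a name I made up, window and offset are assumed to be non-negative integers, and the input is assumed to contain only finite values or NaN (infinities would break the subtraction of prefix sums). Results may also differ from np.nanmean in the last few floating-point digits.

```python
import numpy as np

def moving_nanmean(array, window, offset):
    """Sketch: NaN-aware moving average over window + 1 rows, via prefix sums."""
    array = np.asarray(array, dtype=float)
    n_rows, n_cols = array.shape
    out = np.full((n_rows, n_cols), np.nan)

    span = window + 1        # the loop version averages window + 1 rows
    start = window + offset  # first output row that has a full window
    if start >= n_rows:
        return out

    valid = ~np.isnan(array)
    zero = np.zeros((1, n_cols))
    # Prefix sums with a leading zero row: sums[b] - sums[a] covers rows a..b-1.
    sums = np.vstack((zero, np.cumsum(np.where(valid, array, 0.0), axis=0)))
    counts = np.vstack((zero, np.cumsum(valid, axis=0)))

    n_out = n_rows - start
    # Output row start + i averages input rows i .. i + span - 1.
    win_sum = sums[span:span + n_out] - sums[:n_out]
    win_cnt = counts[span:span + n_out] - counts[:n_out]

    # A window with no valid values gives 0 / 0, which is left as NaN.
    with np.errstate(invalid="ignore", divide="ignore"):
        out[start:] = win_sum / win_cnt
    return out

a = np.arange(10 * 11).reshape(10, 11)
result = moving_nanmean(a, 5, 2)
print(result[7, 0])  # mean of 0, 11, 22, 33, 44, 55 → 27.5
```

Since there is no Python-level loop over rows and the cost does not grow with the window size, I would expect this to scale better on large arrays, but I haven't timed it against f and h.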