3

When I compute a weighted running (rolling) mean in numpy, I do something like this:

import numpy as np

data = np.random.random(100)  # Example data
weights = np.array([1, 2, 1])
data_m = np.convolve(data, weights / weights.sum(), "same")

I then replace data_m[0] and data_m[-1] with, e.g., NaN, depending on the application, because the "same" mode implicitly pads the edges with zeros.
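The NumPy approach above can be wrapped in a small helper. This is only a sketch: the function name `weighted_rolling_mean` is hypothetical, and it assumes an odd-length weight vector and input at least as long as the window.

```python
import numpy as np

def weighted_rolling_mean(data, weights):
    """Centered weighted rolling mean; points without a full window become NaN."""
    data = np.asarray(data, dtype=float)
    weights = np.asarray(weights, dtype=float)
    # np.convolve flips the kernel, so reverse it to keep the weights in window order.
    kernel = weights[::-1] / weights.sum()
    out = np.convolve(data, kernel, mode="same")
    # Mask the edge points, where "same" mode implicitly padded with zeros.
    half = len(weights) // 2
    if half:
        out[:half] = np.nan
        out[-half:] = np.nan
    return out

smoothed = weighted_rolling_mean(np.arange(5.0), [1, 2, 1])
print(smoothed)
```

For symmetric weights such as [1, 2, 1] the reversal makes no difference; it only matters for asymmetric windows.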

Something similar can be done with xarray. In this case, I use

xr.DataArray(data).rolling(dim_0=3, center=True).mean()

But this corresponds to the weights

weights = np.array([1, 1, 1])

in the numpy example. How would I apply other weights when using xarray?

Hallgeir Wilhelmsen

3 Answers

8

A weighted rolling mean is not yet implemented in xarray.

The following does almost the same thing, but it is quite slow. I think np.convolve is currently the best choice.

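def weighted_sum(x, axis):
    # Weighted sum (not mean) over the window; divide by 4 to normalize.
    weight = [1, 2, 1]
    if x.shape[axis] == 3:
        return np.sum(x * weight, axis=axis)
    else:
        # Incomplete window at the edge.
        return np.nan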

da.rolling(dim_0=3, center=True).reduce(weighted_sum)

Currently, we are working to support more flexible (and faster) rolling operations. See https://github.com/pydata/xarray/pull/1837

EDIT:

With xarray 0.10.2, a weighted rolling mean can be computed as follows:

weight = xr.DataArray([0.25, 0.5, 0.25], dims=['window'])
da.rolling(dim_0=3, center=True).construct('window').dot(weight)

Here the construct method builds a view of the rolling object in which the window dimension (named window in the example above) is attached as the last dimension. The inner product with the weight array then gives the weighted sum along the window dimension; because these weights sum to 1, that sum is the weighted mean.
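For weights that do not already sum to 1, normalizing them first keeps the result a mean. The following is a minimal sketch of that variant, assuming a 1-D array along a dimension named dim_0; the variable names are illustrative only.

```python
import numpy as np
import xarray as xr

da = xr.DataArray(np.arange(6, dtype=float), dims=["dim_0"])

# Unnormalized weights, divided by their sum so the result is a mean.
raw = xr.DataArray([1.0, 2.0, 1.0], dims=["window"])
weight = raw / raw.sum()

# construct() adds a "window" dimension holding each centered 3-point window;
# incomplete windows at the edges are filled with NaN by default.
rolled = da.rolling(dim_0=3, center=True).construct("window")

# The dot product sums over the shared "window" dimension.
weighted = rolled.dot(weight)
print(weighted.values)
```

The NaN fill in the edge windows propagates through the dot product, so the first and last points come out as NaN, matching the manual NaN replacement in the numpy version of the question.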

Keisuke FUJII

Not sure, but I think you need to divide by the sum of the weights, right? da.rolling(dim_0=3, center=True).construct('window').dot(weight) / weight.sum() – iury simoes-sousa Oct 27 '20 at 20:10

4

If you want a Gaussian-like filter, another hack is to apply the rolling mean recursively.

Applying the boxcar filter (i.e., our rolling mean) repeatedly converges to a Gaussian filter. See the B-spline article on Wikipedia for details.
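The effective weights of a repeated rolling mean can be inspected by convolving the boxcar kernel with itself. This is a sketch in plain numpy that ignores edge handling (min_periods): two passes of a window-2 mean give exactly the [1, 2, 1] / 4 weights from the question, and further passes give binomial weights that increasingly resemble a Gaussian.

```python
import numpy as np

boxcar = np.array([0.5, 0.5])  # a window-2 rolling mean written as a kernel
kernel = np.array([1.0])       # identity: no smoothing applied yet
kernels = []
for n_passes in range(1, 5):
    # Each additional pass convolves the effective kernel with the boxcar again.
    kernel = np.convolve(kernel, boxcar)
    kernels.append(kernel)
    print(n_passes, kernel)
```

Note that an even number of window-2 passes yields an odd-length kernel, but the output of the rolling operation itself is still shifted unless the windows are centered.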

Example:

import matplotlib.pyplot as plt
import xarray as xr

x = xr.DataArray([0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0], dims=['x'])

# With window=2
tmp = x
plt.plot(tmp, '-ok', label='original')
for i in range(3):
    tmp = tmp.rolling(x=2, min_periods=1).mean()
    plt.plot(tmp, '-o', label='{}-times'.format(i+1))
plt.legend()

[Figure: recursive rolling mean with window size 2]

# with window=3, center=True
tmp = x
plt.plot(tmp, '--ok', label='original')
for i in range(3):
    tmp = tmp.rolling(x=3, center=True, min_periods=1).mean()
    plt.plot(tmp, '-o', label='{}-times'.format(i+1))
plt.legend()

[Figure: recursive rolling mean with window size 3, centered]

Note: if you want the result to be centered, use an odd window size.

Keisuke FUJII
0

This is specific to the [1, 2, 1] weights and requires two steps, so it is not the best solution, but it is quite quick:

dim_name = "dim_0"
da_mean = da.rolling(**{dim_name: 3, "center": True}).mean()
da_mean = (3 * da_mean + da) / 4.  # 3 * mean is the 3-point sum; adding the middle value gives the [1, 2, 1] weights.
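
The identity behind this trick, (x[i-1] + x[i] + x[i+1] + x[i]) / 4, can be checked numerically. The sketch below uses plain numpy on interior points only (mode="valid"), so it does not depend on xarray's edge handling; the variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.random(20)

# Plain 3-point mean on the interior points.
box_mean = np.convolve(x, np.ones(3) / 3, mode="valid")
# Two-step trick: scale the mean back up to a sum, add the middle value, normalize.
two_step = (3 * box_mean + x[1:-1]) / 4
# Direct weighted mean with [1, 2, 1] weights.
direct = np.convolve(x, np.array([1, 2, 1]) / 4, mode="valid")

print(np.allclose(two_step, direct))
```
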
Hallgeir Wilhelmsen