I am trying to calculate the rolling correlation between two xarray dataarrays.

Suppose my dataset is:

<xarray.Dataset>
Dimensions:            (date: 2621, x: 100)
Coordinates:
  * date               (date) datetime64[ns] 2007-01-03 2007-01-04 ...
  * x                  (x) int64 1 2 3 4 5 6 ...
Data variables:
    a                  (date) float64 -0.001011 0.001227 -0.006087 0.002535 ...
    b                  (date, x) float64 -0.001007 -0.0001312 -0.02594 ...

I would like to compute the rolling correlation coefficients between a and b so that the result has dimensions (date, x). The date dimension is retained because the rolling window is applied along the date axis.

I was able to put together an ugly way to do this, full of for loops, but was wondering whether there is a way to do it by applying the reduce function on the rolling dataset object. I can't see how, but there may be an entirely different approach I am missing.

More generally, the problem is to apply an arbitrary function that takes two series of numbers as inputs (in this case, the correlation function).

user32430

1 Answer

One can construct a new rolling window dimension using DatasetRolling.construct, then calculate correlation over the window dim using xarray.corr:

  1. Instantiate as a Dataset. Concatenating to a DataArray on a new dim would also work.
import numpy as np
import xarray as xr
ds = xr.Dataset({
    'series1': xr.DataArray(np.arange(10), dims='x'),
    'series2': xr.DataArray(np.arange(10, 20), dims='x')
})
ds
# <xarray.Dataset>
# Dimensions:  (x: 10)
# Dimensions without coordinates: x
# Data variables:
#     series1  (x) int64 0 1 2 3 4 5 6 7 8 9
#     series2  (x) int64 10 11 12 13 14 15 16 17 18 19
  2. Instantiate a rolling window object using Dataset.rolling, and construct a new window dimension on this object using DatasetRolling.construct:
rolling = ds.rolling(x=3)
with_dim = rolling.construct('window_dim')
with_dim
# <xarray.Dataset>
# Dimensions:  (window_dim: 3, x: 10)
# Dimensions without coordinates: window_dim, x
# Data variables:
#     series1  (x, window_dim) float64 nan nan 0.0 nan 0.0 ... 7.0 8.0 7.0 8.0 9.0
#     series2  (x, window_dim) float64 nan nan 10.0 nan ... 18.0 17.0 18.0 19.0
  3. Invoke xarray.corr as usual:
xr.corr(with_dim['series1'], with_dim['series2'], dim='window_dim')
# <xarray.DataArray (x: 10)>
# array([nan,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.])
# Dimensions without coordinates: x
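
The same three steps should carry over to the shapes in the question, where a is (date) and b is (date, x), because xarray.corr broadcasts its two inputs against each other. What follows is only a sketch under assumptions: the data are synthetic stand-ins, and the window length of 10 and the dimension name "window" are arbitrary choices, not anything taken from the question.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic stand-in for the question's dataset (values are made up).
rng = np.random.default_rng(0)
dates = pd.date_range("2007-01-03", periods=50, freq="D")
ds = xr.Dataset(
    {
        "a": ("date", rng.normal(size=50)),
        "b": (("date", "x"), rng.normal(size=(50, 4))),
    },
    coords={"date": dates, "x": np.arange(1, 5)},
)

window = 10  # assumed window length, chosen only for illustration

# Every variable gains a "window" dimension holding the values in each
# rolling window along date; incomplete leading windows are padded with NaN.
windows = ds.rolling(date=window).construct("window")

# a is now (date, window) and b is (date, x, window); xr.corr broadcasts
# a across x and reduces over the window dimension.
rolling_corr = xr.corr(windows["a"], windows["b"], dim="window")
rolling_corr = rolling_corr.transpose("date", "x")

# Optional: keep only positions where the window is completely filled.
n_valid = windows["a"].notnull().sum("window")
full_only = rolling_corr.where(n_valid == window)
```

Note that xarray.corr skips NaN pairs, so the leading, partially filled windows still yield values computed from fewer points (as in the 1-D output above, where only the first entry is nan). The last two lines mask those positions if complete windows are required.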

A very stale thread, yes, but hopefully this helps someone.
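
For the more general case raised in the question (an arbitrary function of two series rather than correlation), one possible route is to pass the constructed windows to xarray.apply_ufunc with the window dimension declared as a core dimension. This is a sketch only: pair_stat is a hypothetical placeholder for whatever two-series function is needed, and the data and window length are invented.

```python
import numpy as np
import xarray as xr


def pair_stat(u, v):
    """Hypothetical two-series function: Pearson r, or NaN if a window is incomplete."""
    if np.isnan(u).any() or np.isnan(v).any():
        return np.nan
    return np.corrcoef(u, v)[0, 1]


rng = np.random.default_rng(1)
a = xr.DataArray(rng.normal(size=30), dims="date")
b = xr.DataArray(rng.normal(size=(30, 3)), dims=("date", "x"))

window = 5  # assumed window length
a_win = a.rolling(date=window).construct("window")  # (date, window)
b_win = b.rolling(date=window).construct("window")  # (date, x, window)

# vectorize=True calls pair_stat once per (date, x) pair, handing it the two
# 1-D windows; the "window" core dimension is consumed by the function.
result = xr.apply_ufunc(
    pair_stat,
    a_win,
    b_win,
    input_core_dims=[["window"], ["window"]],
    vectorize=True,
)
```

Since vectorize=True loops in Python under the hood, this is likely to be much slower than xarray.corr on large arrays; it trades speed for the freedom to plug in any function.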

EhoTACC