python xarray concat groupby in datetime64 dimensions

Question

I have an xarray dataset:

ds
<xarray.Dataset>    
Dimensions:  (lat: 360, lon: 720, time: 3652)
Coordinates:
  * lon      (lon) float32 -179.75 -179.25 -178.75 -178.25 -177.75 -177.25     ...
  * lat      (lat) float32 89.75 89.25 88.75 88.25 87.75 87.25 86.75 86.25 ...
  * time     (time) datetime64[ns] 2010-01-01 2010-01-02 2010-01-03 ...
Data variables:
    dis      (time, lat, lon) float64 nan nan nan nan nan nan nan nan nan...

There are NaNs in the dis variable, but the array is not entirely NaN. The time dimension covers 10 years of daily data (3652 days).

What I want is the mean for each calendar month over the 10-year time series, for each grid square (lat, lon). So the output dataset would have:

Dimensions:  (lat: 360, lon: 720, time: 12)  #<<< or 'months'

One option I saw that almost does what I want is:

ds.dis.groupby('time.month').mean()

However, the output of this is just a 12-item array, i.e. we lose both the lat and lon dimensions.

<xarray.DataArray 'dis' (month: 12)>
array([ 368.26764123,  394.0543304 ,  424.67056092,  476.94943773,
    522.383195  ,  516.37355647,  497.74700652,  472.46993274,
    456.87268206,  402.44729131,  367.41928436,  362.6121917 ])
Coordinates:
  * month    (month) int64 1 2 3 4 5 6 7 8 9 10 11 12

I figure there are probably simple ways to do this using the datetime64 methods, but I have struggled to make full sense of them.

While writing this, I managed to get the result by doing:

stacked = xr.concat([ds.dis[tlist[month,:],:,:].mean(dim='time',skipna=True) for month in range(0,12)],dim='month')

which gives:

<xarray.DataArray 'dis' (month: 12, lat: 360, lon: 720)>

However, is there a more Pythonic way, closer to the first line of code, that uses groupby?

Thanks

score 2 · Accepted Answer · answered May 27 '16 at 17:45

To avoid aggregating over all dimensions in each sub-array, you need to supply the dimension(s) to reduce over explicitly:

ds.dis.groupby('time.month').mean('time')

(At one point we contemplated making this the default behavior for groupby operations, since it is usually what is desired, but then it's not clear how to trigger the current default of aggregating over all dimensions.)
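
As a rough illustration, here is a minimal sketch on a small synthetic array. The grid size, date range, and random values are assumptions standing in for the real dataset in the question. It reduces each monthly group over time only and then checks January against a mean computed by hand:

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic stand-in for ds.dis: two years of daily data on a tiny grid
# (sizes and values are assumptions, not the data from the question).
time = pd.date_range("2010-01-01", "2011-12-31", freq="D")
lat = np.linspace(89.75, -89.75, 4)
lon = np.linspace(-179.75, 179.75, 6)
rng = np.random.default_rng(0)
data = rng.random((time.size, lat.size, lon.size))
data[0, 0, 0] = np.nan  # a single missing value; mean() skips NaNs for floats

dis = xr.DataArray(
    data,
    coords={"time": time, "lat": lat, "lon": lon},
    dims=("time", "lat", "lon"),
    name="dis",
)

# Reduce each monthly group over "time" only, so lat and lon are kept.
monthly = dis.groupby("time.month").mean("time")
print(dict(monthly.sizes))

# Cross-check January against a mean over the January time steps.
jan_idx = np.flatnonzero(time.month == 1)
jan_manual = dis.isel(time=jan_idx).mean("time")
jan_matches = np.allclose(monthly.sel(month=1).values, jan_manual.values)
print(jan_matches)
```

The result has a month dimension of length 12 alongside the original lat and lon dimensions, which is the shape the concat approach in the question produces.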
