
I have an array with dimensions (9131,101,191). The first dimension is the days from 1/1/2075 to 31/12/2099. I want to extract all the days that fall in the month of July. How can I do this in xarray? I have tried using loops and NumPy but am not getting the desired result. Ultimately, I want to extract all the arrays that fall in July and compute their mean.

Here is the array. Its name is initialize_c3 and its shape is (9131,101,191).

import pandas as pd
import xarray as xr

arr_c3 = xr.DataArray(
    initialize_c3,
    dims=("time", "lat", "lon"),
    coords={
        "time": pd.date_range("2075-01-01", periods=9131, freq="D"),
        "lat": list(range(1, 102)),
        "lon": list(range(1, 192)),
    },
)

I have tried grouping by month: try = arr_c3.groupby(arr_c3.time.dt.month). After this, the shape of try is (755,1,1), but I want the dimensions of try to be (755,101,191). What am I doing wrong?

jhamman
Sohaib
  • What is `xr`? Is it a library or a variable you are defining somewhere? Please post the code that you have. – Sembei Norimaki Jan 10 '23 at 10:31
  • @SembeiNorimaki xr comes from xarray. I am importing xarray as xr. – Sohaib Jan 10 '23 at 10:51
  • You should include the relevant code in your question. – Sembei Norimaki Jan 10 '23 at 10:55
  • I have done that – Sohaib Jan 10 '23 at 11:07
  • Tried executing your code. It says `initialize_c3` doesn't exist. Is the code you posted a minimal reproducible example that we can execute? – Sembei Norimaki Jan 10 '23 at 11:59
  • Actually, initialize_c3 is an array that cannot be uploaded here. Could you use this: import xarray as xr; arr_c3 = xr.DataArray(initialize_c3, dims=("time", "lat", "lon"), coords={"time": pd.date_range("2075-01-01", periods=9131, freq="D"), "lat": list(range(1, 102)), "lon": list(range(1, 192))}) – Sohaib Jan 10 '23 at 12:42

1 Answer


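Calling groupby() on its own returns a GroupBy object rather than an array, so you still need to apply a reduction such as mean() to it. You can use groupby() followed by mean() to calculate the monthly climatology, then use sel() to select the monthly mean for July (here ds stands for your DataArray, arr_c3):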

ds.groupby('time.month').mean().sel(month=7)
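
To make the one-liner above concrete, here is a minimal runnable sketch. It assumes a small random array as a stand-in for initialize_c3 (a 4 x 5 grid instead of 101 x 191, purely to keep it light); the names data, monthly_clim and july_mean are illustrative and not from the question.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Assumption: synthetic data standing in for initialize_c3, on a small 4 x 5 grid.
time = pd.date_range("2075-01-01", periods=9131, freq="D")
data = np.random.default_rng(0).random((time.size, 4, 5))

ds = xr.DataArray(
    data,
    dims=("time", "lat", "lon"),
    coords={"time": time, "lat": np.arange(1, 5), "lon": np.arange(1, 6)},
)

# Group by calendar month and average over time within each group.
# The result has a new "month" dimension with 12 entries.
monthly_clim = ds.groupby("time.month").mean()

# Pick out July; the result has only the lat and lon dimensions left.
july_mean = monthly_clim.sel(month=7)

print(monthly_clim.sizes["month"])
print(july_mean.dims)
```

The July mean keeps the full spatial grid, so with the real data it would have shape (101, 191).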

Another way that avoids calculating the means for the other months is to first filter all days in July:

ds.sel(time=(ds.time.dt.month == 7)).mean('time')
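
The filtering approach can be sketched the same way. This again assumes a small random array in place of initialize_c3; july_days and july_mean are illustrative names. The intermediate july_days array is what keeps every July day along the time dimension together with the full lat/lon grid.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Assumption: synthetic data standing in for initialize_c3, on a small 4 x 5 grid.
time = pd.date_range("2075-01-01", periods=9131, freq="D")
data = np.random.default_rng(0).random((time.size, 4, 5))

ds = xr.DataArray(
    data,
    dims=("time", "lat", "lon"),
    coords={"time": time, "lat": np.arange(1, 5), "lon": np.arange(1, 6)},
)

# Boolean mask along time: True wherever the date falls in July.
is_july = ds.time.dt.month == 7

# Keep only the July days: 25 years x 31 days = 775 time steps.
july_days = ds.sel(time=is_july)

# Average over the remaining time steps to get one value per grid cell.
july_mean = july_days.mean("time")

print(july_days.sizes["time"])
print(july_mean.shape)
```

With the real data, july_days would have shape (775, 101, 191) and july_mean would have shape (101, 191).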
jhamman