I have multiple .nc files that I combined using xr.open_mfdataset. I then subset the result to my study region, applied a land mask using regionmask, and resampled the hourly data to a daily mean, but the result is wrong.

ds = xr.open_mfdataset("../t2m/*.nc")

This has hourly data from 01-01-1979 to 31-12-2019 and gives Dimensions: longitude: 1440 latitude: 721 time: 359400. Subsetting to only the month of May of each year and to only the study region using

dm = ds.isel(time=(ds.time.dt.month == 5)).sel(longitude=slice(65, 90), latitude=slice(35, 8)).where(land_mask == 0)

gives Dimensions: longitude: 101 latitude: 109 time: 30504. Here land_mask is built with regionmask and is used to mask out the ocean data points.

Now taking the daily mean with

dout = dm.resample(time='D').mean(dim="time").dropna(dim='time')

gives Dimensions: time: 0 latitude: 109 longitude: 101. The output should have had time: 1271, i.e. the number of days in all 41 months of May in the dataset (41*31 = 1271). I don't see any reason why the time dimension is empty instead of having 1271 entries. NOTE: If I skip the isel, sel and regionmask operations, this works fine (see this previous question).

I can't come up with a reason here.

Yash_U

1 Answer

See the .dropna docs:

dim (Hashable) – Dimension along which to drop missing values. Dropping along multiple dimensions simultaneously is not yet supported.

how ({"any", "all"}, default: "any") –

  • any : if any NA values are present, drop that label
  • all : if all values are NA, drop that label

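With the default how="any", .dropna(dim="time") drops a time label if any value at that time step is NaN. In a spatially masked dataset every time step contains NaN at the masked pixels, so all time points are always dropped. If you only want to drop time steps in which all pixels are NaN, use how='all', i.e. .dropna(dim="time", how="all").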
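As a rough illustration of the difference, here is a minimal sketch on synthetic data. The 2x2 grid, the two years of May-only hourly values and the hand-written mask are all assumptions made for the example; they only stand in for the real dataset and the regionmask output.

```python
import numpy as np
import xarray as xr

# Hypothetical stand-in for the real data: hourly values for May 2000 and
# May 2001 only, on a tiny 2x2 grid.
may_2000 = np.arange("2000-05-01", "2000-06-01", dtype="datetime64[h]")
may_2001 = np.arange("2001-05-01", "2001-06-01", dtype="datetime64[h]")
time = np.concatenate([may_2000, may_2001]).astype("datetime64[ns]")

da = xr.DataArray(
    np.ones((time.size, 2, 2)),
    dims=("time", "latitude", "longitude"),
    coords={"time": time},
)

# Hypothetical land mask: one "ocean" cell becomes NaN at every time step.
is_land = xr.DataArray(
    [[True, False], [True, True]], dims=("latitude", "longitude")
)
masked = da.where(is_land)

# Daily mean; the masked cell stays NaN on every day.
daily = masked.resample(time="D").mean(dim="time")

# Default how="any": every day contains a NaN, so every day is dropped.
dropped_any = daily.dropna(dim="time")

# how="all": only days where the whole grid is NaN are dropped,
# which leaves the 2 * 31 days of May.
dropped_all = daily.dropna(dim="time", how="all")

print(dropped_any.sizes["time"], dropped_all.sizes["time"])
```

On the dataset from the question, the same how="all" call should leave the 1271 May days, assuming no May day is NaN over the entire study region.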

Michael Delgado