
I have an xarray DataArray with coordinates lat, lon and time (and height, but I have only one level). I am computing statistics to summarise the time series at each grid point and have a function to do this. I would like the function to return either a numpy array or a string at each grid (lat/lon) point (or to write the result directly to my file). The process is slow, so I am using dask to parallelise it.

So far I have successfully used apply_ufunc across all dimensions of the DataArray to produce one of the input parameters to this function, using the call below:

ds['binsNew'] = xr.apply_ufunc(vectorized_fn1,ds['bins'], dask='allowed')

However, the call below is not working as expected:

fit_params = xr.apply_ufunc(vectorized_fn2_ts_stats, ds['WS'], ds['binsNew'],ds['lon'],ds['lat'],
                            input_core_dims=[['time'],['time'],[],[]],
                            vectorize=True,
                            dask='parallelized')

Yesterday it appeared to produce the required array inside the function but could not store it in the DataArray. Today, however, when I print the input variables inside the function they are all incorrectly 1, and the function appears to be executed for only one location on the grid.

Does anyone have any advice on whether this approach should work, or a better way to tackle the issue?

oben
  • Go through [how-to-ask](https://stackoverflow.com/help/how-to-ask) and [edit](https://stackoverflow.com/posts/76742673/edit) your post. What have you tried? Where is your code? – shaik moeed Jul 22 '23 at 07:07
  • Sorry, I pressed send too early. Edited now. – oben Jul 22 '23 at 07:18
  • Your general approach seems okay, except that it seems unnecessary (and potentially slow) to use a function that can return multiple types (float vs. string, etc.). Just return floats or NaNs (which are a type of float) and worry about saving the results later. You may want to read the apply_ufunc material on the tutorial.xarray.dev website. As for your error, without more info it's hard to help further. Clearly something must have changed to cause a change in the behavior of your code. – ThomasNicholas Jul 24 '23 at 13:48
  • Thank you @ThomasNicholas. My thinking was to return a numpy array at each point that contains 56 values. I don't need multiple types; I just tried directly saving a string containing those 56 values instead, but xarray didn't seem to like strings: in my test it only stored the first character, so I reverted to saving the numpy array. Is this not possible? Would I need to save those 56 values as individual variables in my DataArray? The only reason I hadn't already is that the Weibull fit returns 4 parameters and I need all of them to be saved in the array. – oben Jul 24 '23 at 16:27
  • Why 56? Is that the size of one of the dimensions of your input? Or a new dimension on the output? (Again a reproducible example would be helpful...) Don't store 56 variables in your DataArray, those should correspond to a dimension. You can save multiple variables in the output, and storing 4 new variables for the 4 parameters of the fit seems reasonable. Look on the tutorial site I mentioned above, there is an example showing how to return multiple variables. – ThomasNicholas Jul 25 '23 at 14:48
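The suggestion in the comments (make the 56 values a dimension of the output rather than 56 separate variables) could look roughly like the sketch below. It is only an illustration under stated assumptions: `ts_stats` is a hypothetical stand-in for `vectorized_fn2_ts_stats`, the dataset is synthetic, the new dimension name `stat` is invented here, and `lon`/`lat` are left out of the inputs for brevity. The key addition relative to the call in the question is `output_core_dims`, which tells `apply_ufunc` that each grid point yields a 1-D array.

```python
import numpy as np
import xarray as xr

N_STATS = 56  # values returned per grid point (number taken from the comments)


def ts_stats(ws, bins):
    """Hypothetical stand-in for vectorized_fn2_ts_stats.

    Receives the 1-D time series for a single grid point and returns
    a float array of length N_STATS (unused slots stay NaN).
    """
    out = np.full(N_STATS, np.nan)
    out[0] = ws.mean()
    out[1] = ws.std()
    out[2] = bins.max()
    return out


# Synthetic data standing in for the dataset in the question.
rng = np.random.default_rng(0)
ds = xr.Dataset(
    {
        "WS": (("time", "lat", "lon"), rng.random((20, 3, 4))),
        "binsNew": (("time", "lat", "lon"), rng.random((20, 3, 4))),
    },
    coords={
        "time": np.arange(20),
        "lat": [10.0, 11.0, 12.0],
        "lon": [0.0, 1.0, 2.0, 3.0],
    },
)

fit_params = xr.apply_ufunc(
    ts_stats,
    ds["WS"],
    ds["binsNew"],
    input_core_dims=[["time"], ["time"]],  # the function consumes 'time'
    output_core_dims=[["stat"]],           # and produces a new 'stat' dimension
    vectorize=True,                        # loop over every (lat, lon) point
    # The options below only take effect when the inputs are dask-backed
    # (for example after ds.chunk(...)); 'time' must then be a single chunk.
    dask="parallelized",
    output_dtypes=[float],
    dask_gufunc_kwargs={"output_sizes": {"stat": N_STATS}},
)

print(fit_params.dims, fit_params.shape)
```

With dask-backed inputs the result is lazy, so nothing is evaluated for the full grid until `.compute()` is called or the result is written to disk. If the function instead returned several scalars (such as the 4 fit parameters), one entry per return value in `output_core_dims` (for example `[[], [], [], []]`) would give a tuple of DataArrays.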

0 Answers