
I have a netCDF file which I have read with xarray. The array contains time, latitude and longitude coordinates and only one data variable (i.e. index values).

    # read the netCDF file
    import xarray as xr

    with xr.open_mfdataset('wet_tropics.nc') as wet:
        print(wet)

    Out[]: 
    <xarray.Dataset>
    Dimensions:       (time: 1437, x: 24, y: 20)
    Coordinates:
      * y             (y) float64 -1.878e+06 -1.878e+06 -1.878e+06 -1.878e+06 ...
      * x             (x) float64 1.468e+06 1.468e+06 1.468e+06 1.468e+06 ...
      * time          (time) object '2013-03-29T00:22:28.500000000' ...
    Data variables:
        index_values  (time, y, x) float64 dask.array<shape=(1437, 20, 24), chunksize=(1437, 20, 24)>

So far, so good. Now I need to apply a generalized additive model to each grid cell in the array. The model I want to use comes from Facebook Prophet (https://facebook.github.io/prophet/), and I have successfully applied it to a pandas DataFrame before. For example:

    from fbprophet import Prophet

    cns_ap['y'] = cns_ap['av_index']  # Prophet requires the column names 'y' and 'ds'
    cns_ap['ds'] = cns_ap['Date']
    cns_ap['cap'] = 1                 # logistic growth needs a carrying capacity ('cap') column
    m1 = Prophet(weekly_seasonality=False,   # disable weekly seasonality
                 daily_seasonality=False,    # disable daily seasonality
                 growth='logistic',          # logistic because the indices have a maximum
                 yearly_seasonality=4,       # Fourier order for the yearly component (int, typically 1-10)
                 changepoint_prior_scale=0.5).fit(cns_ap)
    future1 = m1.make_future_dataframe(periods=60,            # 5-year prediction
                                       freq='M',              # monthly predictions
                                       include_history=True)  # fit the model to all historical data
    future1['cap'] = 1  # sets the cap at the maximum index value
    forecast1 = m1.predict(future1)
    # m1.plot_components(forecast1, plot_cap=False)
    # m1.plot(forecast1, plot_cap=False, ylabel='CNS index', xlabel='Year')

The problem is that now I have to 1) iterate through every cell of the netCDF file, 2) get all the values for that cell through time, 3) apply the GAM (using fbprophet), and then export and plot the results.

The question: do you have any ideas on how to loop through the raster and get the index_values of each pixel for all times so that I can run the GAM? I think a nested for loop would be feasible, although I don't know how to write one that goes through every cell. A rough sketch of what I have in mind is below.
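For reference, this is roughly the kind of nested loop I have in mind (just an untested sketch: `forecasts` is a placeholder for collecting the results, the time coordinate is parsed with `pd.to_datetime` because it is stored as strings, and all-NaN pixels are skipped):

    import pandas as pd
    import xarray as xr
    from fbprophet import Prophet

    with xr.open_mfdataset('wet_tropics.nc') as wet:
        da = wet['index_values']
        forecasts = {}  # placeholder: one forecast DataFrame per (x, y) pixel
        for ix in range(da.sizes['x']):
            for iy in range(da.sizes['y']):
                # pull the full time series for this pixel into a pandas DataFrame
                df = da.isel(x=ix, y=iy).to_dataframe().reset_index()[['time', 'index_values']]
                df = df.rename(columns={'time': 'ds', 'index_values': 'y'})
                df['ds'] = pd.to_datetime(df['ds'])  # the time coordinate is stored as strings
                df = df.dropna(subset=['y'])
                if df.empty:
                    continue  # skip pixels with no data
                df['cap'] = 1
                m = Prophet(weekly_seasonality=False, daily_seasonality=False,
                            growth='logistic', yearly_seasonality=4,
                            changepoint_prior_scale=0.5).fit(df)
                future = m.make_future_dataframe(periods=60, freq='M')
                future['cap'] = 1
                forecasts[(ix, iy)] = m.predict(future)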

Any help is appreciated

Nicolas
  • Can you use the `to_pandas` method in xarray, and then group_by x and y to run the model on unique pixels? Alternatively I think you would have to use the `apply_ufunc` method in xarray. I did something similar [here](https://stackoverflow.com/questions/49959449/how-to-use-xr-apply-ufunc-with-changing-dimensions?noredirect=1#comment86937496_49959449), but your model is a bit more complicated since it returns a timeseries for each pixel. – Shawn May 21 '18 at 20:11
  • Thanks Shawn! I'll try the `xarray.apply_ufunc` and let you know. As you mention, the GAM poses quite a few challenges because it provides 2 outputs per pixel. I was thinking of storing them as separate variables in the same xarray or exporting them as netCDF files for later use. To use the `to_pandas` method I would still have to iterate through each pixel to group them, right? – Nicolas May 21 '18 at 22:20
  • This was presented to me today: we generally use something like `numpy.ndindex` to iterate over xarrays. You'd need to prevent it from iterating over the time dimension, like: ```for (x, y) in np.ndindex((wet.sizes['x'], wet.sizes['y'])): wet.index_values.isel(x=x, y=y)``` You could also do something like: ```flat = wet.index_values.rename({'time': 'ds'}).stack(point=('x', 'y')).rename('y') for i in range(flat.sizes['point']): df = flat.isel(point=i).to_dataframe() m.fit(df)``` (a cleaned-up sketch of the second snippet is below these comments). – Nicolas May 23 '18 at 23:33
  • Dear Nicolas, see my other post about how to use Facebook Prophet over xarrays: https://stackoverflow.com/questions/56626011/using-prophet-on-netcdf-file-using-xarray/58718179#58718179 – Philipe Riskalla Leal Nov 05 '19 at 19:30
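For completeness, a cleaned-up sketch of the `stack` snippet from the comments above (untested; it renames the columns and parses the time coordinate with `pd.to_datetime` as in the nested-loop sketch, and the Prophet settings mirror the ones in the question):

    import pandas as pd
    import xarray as xr
    from fbprophet import Prophet

    with xr.open_mfdataset('wet_tropics.nc') as wet:
        # collapse x and y into a single 'point' dimension so one loop covers every pixel
        flat = wet['index_values'].stack(point=('x', 'y'))
        for i in range(flat.sizes['point']):
            df = flat.isel(point=i).to_dataframe().reset_index()[['time', 'index_values']]
            df = df.rename(columns={'time': 'ds', 'index_values': 'y'})
            df['ds'] = pd.to_datetime(df['ds'])
            df['cap'] = 1
            m = Prophet(weekly_seasonality=False, daily_seasonality=False,
                        growth='logistic', yearly_seasonality=4,
                        changepoint_prior_scale=0.5)
            m.fit(df)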

0 Answers