Disparity between result of numpy gradient applied directly and applied using xarray.apply_ufunc

Question

I'm trying to use xarray's apply_ufunc to wrap numpy's gradient function, in order to take gradients along one dimension. However, apply_ufunc is returning an array with a different shape to the one which using np.gradient directly returns:

import xarray as xr
import numpy as np

def wrapped_gradient(da, coord):
    """Finds the gradient along a given dimension of a dataarray."""

    dims_of_coord = da.coords[coord].dims
    if len(dims_of_coord) == 1:
        dim = dims_of_coord[0]
    else:
        raise ValueError('Coordinate ' + coord + ' has multiple dimensions: ' + str(dims_of_coord))

    coord_vals = da.coords[coord].values

    return xr.apply_ufunc(np.gradient, da, coord_vals, kwargs={'axis': -1},
                          input_core_dims=[[dim]], output_core_dims=[[dim]],
                          output_dtypes=[da.dtype])



# Test it out by comparing with applying np.gradient directly:
orig = xr.DataArray(np.random.randn(4, 3), coords={'x': [5, 7, 9, 11]}, dims=('x', 'y'))

expected = np.gradient(orig.values, np.array([5, 7, 9, 11]), axis=0)

actual = wrapped_gradient(orig, 'x').values

I want expected and actual to be the same, but instead they are different:

print(expected.shape)
> (4,3)
print(actual.shape)
> (3,4)

(expected and actual are also not just transposed versions of each other.) I'm confused as to why - my understanding of apply_ufunc was that the core dimensions are moved to the end, so that axis=-1 should always be supplied to the ufunc?

score 1 · Accepted Answer · answered Aug 05 '18 at 02:02

1

xr.apply_ufunc moves the input_core_dims to the last position. The dimension that gradient was computed along are moved to the last position, and therefore the resultant shape would be transposed compared with the result by np.gradient.

The problem is that in your script the coordinate is not considered in the apply_ufunc. I think you need to pass input_core_dims for all the inputs; in your case, those for da and coord_vals. Changing [[dim]] to [[dim], []] will compute correctly, i.e.,

return xr.apply_ufunc(np.gradient, da, coord_vals, kwargs={'axis': -1},
                      input_core_dims=[[dim], []], output_core_dims=[[dim]],
                      output_dtypes=[da.dtype])

BTW, I think xarray should raise an Error when input_core_dims does not match those expected for the inputs. I will raise an issue on Github.

answered Aug 05 '18 at 02:02

Keisuke FUJII

1,306
9
13

That fixed it, thank you! (and also writing `expected = np.gradient(orig.values.T, np.array([5, 7, 9, 11]), axis=-1)`). – ThomasNicholas Aug 07 '18 at 20:37
However, if I now try to use `dask='parallelized'` (chunking the input as `chunked = orig.chunk({'y': 1})` then I get an error: `ValueError: Chunks do not add up to shape. Got chunks=((1, 1, 1),), shape=(4,)` – ThomasNicholas Aug 07 '18 at 20:39
Turns out it actually needs to be `input_core_dims=[[dim], [dim]]`, to account for the fact that `coord_vals` is a 1D array aligned along the same dimension I want to take the gradient along. With this change then `dask='parallelized'` will work. – ThomasNicholas Aug 28 '18 at 22:49

Disparity between result of numpy gradient applied directly and applied using xarray.apply_ufunc

1 Answers1

Linked