Simple question, but I can't wrap my head around the documentation:

Given two DataArrays, how can one use apply_ufunc such that the outputs of the functions are collected in a new dimension?

For example:

import numpy as np
import xarray as xr

test1 = xr.DataArray(np.linspace(1, 6, 6).reshape(3, 2))
test2 = xr.DataArray(np.linspace(6, 1, 6).reshape(3, 2))

def foo(a, b):
    return a+b, a-b

xr.apply_ufunc(foo, test1, test2)

This raises an error: ValueError: applied function returned data with unexpected number of dimensions: 3 vs 2, for dimensions ('dim_0', 'dim_1')

Any ideas of how to do this?

Andrew Williams
Raven
  • I've tried answering this, but am finding a similar error to you due to the reshape dimensions. It seems to be related to this question (https://stackoverflow.com/questions/51680659/disparity-between-result-of-numpy-gradient-applied-directly-and-applied-using-xa), in that "`xr.apply_ufunc` moves the `input_core_dims` to the last position" of the output. I'd be keen to see this answered though because the documentation currently seems a bit sparse! – Andrew Williams Mar 06 '20 at 21:08
  • Also, there is a similar example here (https://github.com/pydata/xarray/issues/1815) which now works, however the difference is that in the OP question the `.reshape()` operation adds in extra dimensions to the input DataArrays which you then broadcast through. – Andrew Williams Mar 06 '20 at 21:11

1 Answer

This error is raised because the input_core_dims argument is missing. Because of the reshape, each array has two dimensions:

<xarray.DataArray (dim_0: 3, dim_1: 2)>
array([[1., 2.],
       [3., 4.],
       [5., 6.]])
Dimensions without coordinates: dim_0, dim_1

As you can see, xarray named the dimensions for us, but we need to pass these names to apply_ufunc as follows:

xr.apply_ufunc(
    foo, test1, test2,
    input_core_dims=[["dim_0", "dim_1"], ["dim_0", "dim_1"]],
)

Now we get back the tuple with both operations:

 (<xarray.DataArray (dim_0: 3, dim_1: 2)>
 array([[7., 7.],
        [7., 7.],
        [7., 7.]])
 Dimensions without coordinates: dim_0, dim_1,
 <xarray.DataArray (dim_0: 3, dim_1: 2)>
 array([[-5., -3.],
        [-1.,  1.],
        [ 3.,  5.]])
 Dimensions without coordinates: dim_0, dim_1)
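
For comparison, here is a minimal sketch of the route the xarray documentation describes for functions with several return values: pass output_core_dims with one entry per output. It assumes a reasonably recent xarray version. The empty lists mean that neither output gains a core dimension, and the names total and diff are illustrative only.

```python
import numpy as np
import xarray as xr

test1 = xr.DataArray(np.linspace(1, 6, 6).reshape(3, 2))
test2 = xr.DataArray(np.linspace(6, 1, 6).reshape(3, 2))


def foo(a, b):
    # Same function as in the question: returns a tuple of two arrays.
    return a + b, a - b


# One (empty) list per return value tells apply_ufunc to expect two outputs,
# each with the same dimensions as the broadcast inputs.
results = xr.apply_ufunc(foo, test1, test2, output_core_dims=[[], []])

total, diff = results
print(total.dims)  # ('dim_0', 'dim_1')
print(diff.values)
```

Each element of the returned tuple is an ordinary DataArray, so the two can be combined afterwards if a single object is needed.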

FYI: output_core_dims works for the outputs the same way input_core_dims works for the inputs: it takes one list of dimension names per value the function returns. Both of these kwargs are core to understanding how apply_ufunc works.
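
To collect the results along a new dimension, as the question asks, one option is to stack them inside the function and name the new axis through output_core_dims. This is a sketch under assumptions, not the only way to do it: foo_stacked and the dimension name "result" are hypothetical, and the stacking has to happen on the last axis because apply_ufunc expects output core dimensions at the end.

```python
import numpy as np
import xarray as xr

test1 = xr.DataArray(np.linspace(1, 6, 6).reshape(3, 2))
test2 = xr.DataArray(np.linspace(6, 1, 6).reshape(3, 2))


def foo_stacked(a, b):
    # Hypothetical variant of foo: put both results on a new trailing axis.
    return np.stack([a + b, a - b], axis=-1)


stacked = xr.apply_ufunc(
    foo_stacked, test1, test2,
    # "result" is an arbitrary name chosen here for the new dimension.
    output_core_dims=[["result"]],
)

print(stacked.dims)   # ('dim_0', 'dim_1', 'result')
print(stacked.shape)  # (3, 2, 2)
```

Selecting stacked.isel(result=0) then gives the sums and stacked.isel(result=1) the differences.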

Handfeger