Edit: as of Polars 0.13.18, the apply method converts Numpy datatypes to Polars datatypes without requiring the Numpy item method.
Use apply in a groupby context (rather than map).
In this case, the numpy trapz function has only one required parameter (y); x, dx, and axis are optional and have defaults:
numpy.trapz(y, x=None, dx=1.0, axis=-1)
So, we'll need to specify the x keyword parameter explicitly in our call. (I also assumed that you meant for your y column to be mapped as the y parameter, and your t column to be mapped as the x parameter in the call to numpy.)
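To make the y/x mapping concrete, here is a minimal pure-Python sketch of the composite trapezoidal rule, which is what np.trapz(y, x=...) evaluates for 1-D input. The name trapezoid_area is my own and not part of numpy; the function only illustrates which argument plays which role.

```python
def trapezoid_area(y, x):
    """Approximate the integral of y over x with the composite trapezoidal rule.

    Hypothetical helper for illustration only; it mirrors what
    np.trapz(y, x=x) computes for 1-D input, but is not numpy's code.
    """
    if len(y) != len(x):
        raise ValueError("y and x must have the same length")
    total = 0.0
    for i in range(1, len(x)):
        width = x[i] - x[i - 1]              # spacing between sample points
        mean_height = (y[i] + y[i - 1]) / 2  # average of the two heights
        total += width * mean_height
    return total


# Same sample values as the np.trapz type check later in this answer.
area = trapezoid_area(y=[0, 1, 1], x=[2, 4, 5])
print(area)  # 2.0
```

Swapping the two arguments gives a different number, which is why it is worth naming x explicitly rather than relying on argument order.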
The Series 'y' and 't' will be passed as a list of Series to the lambda function, so we'll use indices to indicate which column maps to which numpy parameter.
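The list-of-columns convention can be mimicked in plain Python. The sketch below is not how Polars works internally: apply_per_group, trapezoid_area, and the sample rows are all made up for illustration, and plain lists stand in for Series. It only shows why lst[0] is y and lst[1] is t: the order follows the column list.

```python
from collections import defaultdict


def apply_per_group(rows, key, columns, func):
    """Call func once per group with a list of column values, in `columns` order.

    Hypothetical stand-in for the groupby + apply pattern; plain lists play
    the role of Series.
    """
    groups = defaultdict(lambda: [[] for _ in columns])
    for row in rows:
        for values, name in zip(groups[row[key]], columns):
            values.append(row[name])
    return {group: func(cols) for group, cols in groups.items()}


def trapezoid_area(y, x):
    # Composite trapezoidal rule over consecutive sample pairs.
    return sum((x1 - x0) * (y0 + y1) / 2
               for x0, x1, y0, y1 in zip(x, x[1:], y, y[1:]))


# Made-up sample data, not the data from the question.
rows = [
    {"id": 0, "t": 2, "y": 0},
    {"id": 0, "t": 4, "y": 1},
    {"id": 0, "t": 5, "y": 1},
    {"id": 1, "t": 0, "y": 1},
    {"id": 1, "t": 1, "y": 2},
    {"id": 1, "t": 2, "y": 3},
]

result = apply_per_group(
    rows, "id", ["y", "t"],
    lambda lst: trapezoid_area(y=lst[0], x=lst[1]),  # lst[0] is "y", lst[1] is "t"
)
print(result)  # {0: 2.0, 1: 4.0}
```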
One additional wrinkle: numpy returns a value of type numpy.float64, rather than a Python float.
type(np.trapz([0, 1, 1], x=[2, 4, 5]))
<class 'numpy.float64'>
In Polars versions before 0.13.18, the apply function does not automatically convert a numpy.float64 to polars.Float64. To remedy this, we'll use the numpy item method to have numpy return a Python float, rather than a numpy.float64.
type(np.trapz([0, 1, 1], x=[2, 4, 5]).item())
<class 'float'>
With this in hand, we can now write our apply statement.
df.groupby("id").agg(
    pl.apply(
        ["y", "t"],
        lambda lst: np.trapz(y=lst[0], x=lst[1]).item()
    )
)
shape: (2, 2)
┌─────┬──────┐
│ id  ┆ y    │
│ --- ┆ ---  │
│ i64 ┆ f64  │
╞═════╪══════╡
│ 1   ┆ 13.0 │
├╌╌╌╌╌┼╌╌╌╌╌╌┤
│ 0   ┆ 2.0  │
└─────┴──────┘