I am trying to write the following Vaex function:
@vaex.register_function(on_expression=True)
def getSumStatsByGroup(df, group, x):
    data = df.groupby(by=group, agg={
        'Min': vaex.agg.min(df[x]),
        'Mean': vaex.agg.mean(df[x]),
        'Max': vaex.agg.max(df[x]),
        'Variance': vaex.agg.var(df[x]),
    })
    return data
However, every time I run it, I get messy output resembling this:
File <unknown>:2
0 AR 2020-12-06 00:00:00.000000000 AR Argentina AR ARG 0 2176.0 150.0 1489103.0 43125.0 3699476.0 nan nan nan nan nan nan 44938712.0 19523766.0 20593330.0 3599141.0 41339571.0 16.515 0.825
But when I fill in the parameters manually:
df.groupby(by='country_name', agg={
    'Min': vaex.agg.min(df['new_confirmed']),
    'Mean': vaex.agg.mean(df['new_confirmed']),
    'Max': vaex.agg.max(df['new_confirmed']),
    'Variance': vaex.agg.var(df['new_confirmed']),
})
the output is as expected. I have tried converting the return value to a pandas DataFrame, calling print() on it, changing the decorator to on_expression=False, and removing the return statement from the function, but each time I get exactly the same result. I am running this in a Jupyter notebook and am very confused about why it works when I fill in the parameters manually but not through the Vaex function. Any help or explanation is greatly appreciated!