I have a data frame in pandas with a DateTime index.
When grouping it with a time grouper: pd.Grouper(freq='360Min')
, how can I join this result back onto the original timestamp?
I.e. an equijoin timestamp=bucket will not work?
Is there a convenience function?
Should the asof
join be used?
Or would I manually have to extract the hours and then try to match it up?
example:
for a source of
import pandas as pd
df = pd.DataFrame(
{
"Publish date": [
pd.Timestamp("2000-01-02"),
pd.Timestamp("2000-01-02"),
pd.Timestamp("2000-01-09"),
pd.Timestamp("2000-01-16")
],
"ID": [0, 1, 2, 3],
"Price": [10, 20, 30, 40]
}
)
Which gives:
Publish date ID Price
0 2000-01-02 0 10
1 2000-01-02 1 20
2 2000-01-09 2 30
3 2000-01-16 3 40
I want to perform an aggregation with an arbitrary frequency (not only month, day, hour), let's say 1
month.
agg_result = df.groupby(pd.Grouper(key="Publish date", freq="1M")).agg([pd.Series.mean, pd.Series.median]).reset_index()
agg_result.columns = ['_'.join(col).strip() for col in agg_result.columns.values]
agg_result.columns = ['Publish date month', 'ID_mean', 'ID_median', 'Price_mean', 'Price_median']
print(agg_result)
Publish date month ID_mean ID_median Price_mean Price_median
0 2000-01-31 1.5 1.5 25 25
How can I ensure that the equijoin would work again? I.e. transform the original timestamp into the fitting bucket using the same arbitrary frequency?
I.e. described in the code of the example, how can I get:
agg_result['Publish date month'] = agg_result['Publish date'].apply(magic transform to same frequency bucket)
df.merge(agg_result, on['Publish date month'])
To work, i.e. define the transformation to the right bucket?