I have a pandas dataframe df:
TypeA TypeB timepoint value
A AB 1 10
A AB 2 10
A AC 1 5
A AC 2 15
A AC 3 10
...
D DB 1 1
D DB 2 1
How can I run a function on each unique combination of 'TypeA' and 'TypeB' and store the results in a new dataframe? Let's assume the following function:
import numpy as np

def running_mean(x, N):
    cumsum = np.cumsum(np.insert(x, 0, 0))
    return (cumsum[N:] - cumsum[:-N]) / float(N)
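To illustrate what this function returns, here is a small sketch using the values of the A/AC group from the table above. The variable names `ac_values` and `result` are only for illustration. The output has N - 1 fewer elements than the input, which is why the first row of each group is NaN in the desired output further down.

```python
import numpy as np

def running_mean(x, N):
    cumsum = np.cumsum(np.insert(x, 0, 0))
    return (cumsum[N:] - cumsum[:-N]) / float(N)

# 'value' column of the A/AC group (timepoints 1, 2, 3)
ac_values = np.array([5, 15, 10])

result = running_mean(ac_values, 2)
print(result.tolist())  # [10.0, 12.5]
```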
Normally, I would use nested for-loops, but I don't think that is a good idea (and I would lose the speed advantage of the built-in pandas functions):
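import pandas as pd

df5 = pd.DataFrame()
for i in df['TypeA'].unique().tolist():
    df2 = df[df['TypeA'] == i]
    for j in df2['TypeB'].unique().tolist():
        # Copy the sub-frame so that adding a column does not touch df
        df3 = df2[df2['TypeB'] == j].copy()
        df3['moving_av'] = np.nan
        moving_av = running_mean(df3['value'].values, 2)
        # The first row of each group has no full window and stays NaN
        df3.iloc[1:1 + len(moving_av), df3.columns.get_loc('moving_av')] = moving_av
        df5 = pd.concat([df5, df3])
# Attach the per-group results to the original dataframe
df = pd.merge(df, df5, how='left', on=['TypeA', 'TypeB', 'timepoint', 'value'])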
My desired output is:
TypeA TypeB timepoint value moving_av
A AB 1 10 NaN
A AB 2 10 10
A AC 1 5 NaN
A AC 2 15 10
A AC 3 10 12.5
...
D DB 1 1 NaN
D DB 2 1 1
Please note that the simple running-mean function is only an example; I am looking for a solution that also works for a larger, more complex function.
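For reference, a minimal sketch of one direction that avoids the explicit loops, assuming `groupby` with `transform` is acceptable here. The wrapper `padded_running_mean` is a hypothetical helper (not part of the code above) that pads the result with NaN so its length matches the group, and the sample frame contains only the rows shown above (the elided rows are left out).

```python
import numpy as np
import pandas as pd

def running_mean(x, N):
    cumsum = np.cumsum(np.insert(x, 0, 0))
    return (cumsum[N:] - cumsum[:-N]) / float(N)

def padded_running_mean(values, N=2):
    # Hypothetical helper: fill the first N - 1 positions with NaN so the
    # result has the same length as the group it was computed from.
    out = np.full(len(values), np.nan)
    if len(values) >= N:
        out[N - 1:] = running_mean(values.to_numpy(), N)
    return out

# Only the rows shown in the question; the elided rows are omitted.
df = pd.DataFrame({
    'TypeA': ['A', 'A', 'A', 'A', 'A', 'D', 'D'],
    'TypeB': ['AB', 'AB', 'AC', 'AC', 'AC', 'DB', 'DB'],
    'timepoint': [1, 2, 1, 2, 3, 1, 2],
    'value': [10, 10, 5, 15, 10, 1, 1],
})

# Apply the function once per (TypeA, TypeB) combination; transform
# aligns the returned values with the original row order.
df['moving_av'] = (
    df.groupby(['TypeA', 'TypeB'])['value'].transform(padded_running_mean)
)
print(df)
```

Any other function could be swapped in for `padded_running_mean`, as long as it returns one value per row of the group.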