For every date date
I want to get the average of the Amount from every combination of Brand.
For example, I have a dataframe:
df1 =
Company Brand Date Amount
A 1 01/01/2015 3
A 1 01/02/2015 4
A 1 01/03/2015 2
A 2 01/01/2015 7
A 2 01/02/2015 2
A 2 01/03/2015 1
A 3 01/01/2015 6
A 3 01/02/2015 3
A 3 01/03/2015 1
and I want the result to be the following df, where the Amount is the average of the combined groups:
result =
Company Brand Date Amount
A 1 01/01/2015 3
A 1 01/02/2015 4
A 1 01/03/2015 2
A 2 01/01/2015 7
A 2 01/02/2015 2
A 2 01/03/2015 1
A 3 01/01/2015 6
A 3 01/02/2015 3
A 3 01/03/2015 1
A 1_2 01/01/2015 5
A 1_2 01/02/2015 3
A 1_2 01/03/2015 1.5
A 2_3 01/01/2015 6.5
A 2_3 01/02/2015 2.5
A 2_3 01/03/2015 1
A 1_3 01/01/2015 4.5
A 1_3 01/02/2015 3.5
A 1_3 01/03/2015 1.5
A 1_2_3 01/01/2015 5.33
A 1_2_3 01/02/2015 3
A 1_2_3 01/03/2015 1.33
Currently, I'm using a loop with a groupby to do this, but it is very slow.
d = pd.DataFrame()
comb = ['1_2','1_3','2_3','1_2_3']
for c in comb:
new = df1.loc[(df1.Brand.isin(map(int,c.split('_')))].groupby(['Company','Date'])['Amount'].mean().reset_index()
new.insert(1,'Group',c)
d = d.append(new)
df = df.append(d)
However, I'm working with a thousand unique companies and several million rows, so this is very slow. Is there a way to speed this up?