0

I have the following DataFrame. For each row, I want the average purchase price for the user prior to that purchase. How should the code be written?

 userid product price   website date
0   123 xt23      20    Amazon  2018-01-12
1   123 q45        2    Amazon  2018-01-08
2   123 a89       25    NewEgg  2018-01-06
3   77  q45        3    NewEgg  2018-01-03
4   77  a89       30    NewEgg  2018-01-05
5   92  xt23      24    Amazon  2018-01-04
6   92  m33       60    Amazon  2018-01-07
7   92  a89       28    Amazon  2018-01-02 
DDM
  • 303
  • 4
  • 19

1 Answers1

0

You can do:

# convert to date
df['date'] = pd.to_datetime(df['date'])

# order by group and time
df = df.sort_values(['userid', 'date'])

# calculate average
df['avg_price'] = df.groupby('userid')['price'].transform(lambda x: x.shift().expanding().mean())

print(df)

   userid product  price website       date  avg_price
3      77     q45      3  NewEgg 2018-01-03        NaN
4      77     a89     30  NewEgg 2018-01-05        3.0
7      92     a89     28  Amazon 2018-01-02        NaN
5      92    xt23     24  Amazon 2018-01-04       28.0
6      92     m33     60  Amazon 2018-01-07       26.0
2     123     a89     25  NewEgg 2018-01-06        NaN
1     123     q45      2  Amazon 2018-01-08       25.0
0     123    xt23     20  Amazon 2018-01-12       13.5
YOLO
  • 20,181
  • 5
  • 20
  • 40