
I have a dataframe of daily sales:

import pandas as pd

date = ['28-01-2017','29-01-2017','30-01-2017','31-01-2017','01-02-2017','02-02-2017']
sales = [1,2,3,4,1,2]
ym = [201701,201701,201701,201701,201702,201702]
prev_1_ym = [201612,201612,201612,201612,201701,201701]
prev_2_ym = [201611,201611,201611,201611,201612,201612]

df_test = pd.DataFrame({'date': date,'ym':ym,'prev_1_ym':prev_1_ym,'prev_2_ym':prev_2_ym,'sales':sales})

df_test['date'] = pd.to_datetime(df_test['date'],format = '%d-%m-%Y')

I am trying to find the total sales in the previous 1 month, previous 2 months, etc.

My current approach is to use a list comprehension:

df_test['prev_1m_sales'] = [ sum(df_test.loc[df_test['ym'] == x].sales) for x in df_test['prev_1_ym'] ]

However, this proves to be very slow.

Is there a way to speed it up by using .groupby()?

morienor

1 Answer


You can use the date column to group your data. First change its data type to pandas Timestamps:

df_test['date'] = pd.to_datetime(df_test['date'])

Then you can use it directly in grouping, for example:

df_test.groupby(df_test['date'].dt.month)['sales'].sum().cumsum()
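
For the question's specific goal (attaching the previous month's total sales to each row), one way to use .groupby() is to aggregate sales per ym once and then map the prev_1_ym / prev_2_ym keys onto that result. A minimal sketch, continuing from the df_test defined in the question and assuming the new columns should be called prev_1m_sales and prev_2m_sales, with months missing from the data counted as 0:

# total sales per year-month, computed once
monthly_sales = df_test.groupby('ym')['sales'].sum()

# look up each row's previous-month key in that Series;
# months not present in the data (e.g. 201611, 201612) become NaN and are filled with 0
df_test['prev_1m_sales'] = df_test['prev_1_ym'].map(monthly_sales).fillna(0)
df_test['prev_2m_sales'] = df_test['prev_2_ym'].map(monthly_sales).fillna(0)

This computes the per-month totals a single time instead of re-filtering the whole frame for every row, which is where the list comprehension loses time.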
Mohammed Khalid