1

I have a data frame like this...

Customer Date Balance
Adam 01/01/2022 10
Adam 01/01/2022 20
Adam 03/01/2022 30
Molly 01/01/2022 40
Molly 01/01/2022 50
Molly 03/01/2022 60

I want to find daily day-end balance for each customer. I tried groupby with resample method but couldn't do any. For both customer 2nd of January is missing, so my function should create a row for 2nd January and populat it with previous day end balance. My output should be like this-

Customer Date Balance
Adam 01/01/2022 20
Adam 02/01/2022 20
Adam 03/01/2022 30
Molly 01/01/2022 50
Molly 02/01/2022 50
Molly 03/01/2022 60
  • 1
    IIUC, `df.groupby(['Customer', 'Date']).transform('last')` should do the trick? – timgeb Mar 07 '22 at 10:46
  • Unfortunately, it won't work. Because the main data frame has no date with 2nd January. I have to create a row for 2nd January and take day end balance from 1st January. – Firoj Kawser Jubayer Mar 07 '22 at 11:00

1 Answers1

1

IIUC, you could do:

df2 = (
 df.assign(Date=pd.to_datetime(df['Date'], dayfirst=True))
   .groupby(['Customer', 'Date'], sort=False).last()
)

dt = df2.index.get_level_values('Date')
idx = pd.MultiIndex.from_product([df2.index.get_level_values('Customer').unique(),
                                  pd.date_range(dt.min(), dt.max(), freq='D')
                                 ], names=['Customer', 'Date'])

df2 = df2.reindex(idx).groupby(level='Customer', sort=False).ffill().reset_index()

output:

  Customer       Date  Balance
0     Adam 2022-01-01     20.0
1     Adam 2022-01-02     20.0
2     Adam 2022-01-03     30.0
3    Molly 2022-01-01     50.0
4    Molly 2022-01-02     50.0
5    Molly 2022-01-03     60.0
6     John 2022-01-01      NaN
7     John 2022-01-02     15.0
8     John 2022-01-03     15.0
mozway
  • 194,879
  • 13
  • 39
  • 75
  • Hey, thanks for your answer. Your code almost works except for one thing. My data has many customers that start from 3rd January. So typically 1st and 2nd January should be blank. But it takes values from the previous customer balance. But after 3rd January it takes the correct value. So basically if you add another customer and start from 3rd January, it will take 60 from Molly for 1st and 2nd Jan with your code. It should be blank. Thank you. – Firoj Kawser Jubayer Mar 07 '22 at 18:40
  • @Firoj good point, then just `ffill` per group. See update ;) – mozway Mar 07 '22 at 18:54