I have the table below and want to fill down Stage within each Category, based on a condition: if Stage = "Delivered", fill "Delivered" down to the following rows of that category; otherwise, if Stage = "Paid", fill "Paid" down to the following rows.

Category  Date        Stage
A         2021-11-01  Ordered
A         2021-12-01  Paid
A         2022-01-01
B         2021-08-01
B         2021-09-01  Ordered
B         2021-10-01  Paid
B         2021-11-01  Ordered
B         2021-12-01  Delivered

The result should look like:

Category  Date        Stage
A         2021-11-01  Ordered
A         2021-12-01  Paid
A         2022-01-01  Paid
B         2021-08-01
B         2021-09-01  Ordered
B         2021-10-01  Paid
B         2021-11-01  Paid
B         2021-12-01  Delivered

Could anyone help? I would really appreciate it!

anewone

1 Answer

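Assuming your dataframe is already sorted by the Date column, you can use mask to blank out every stage other than "Paid" and "Delivered", forward-fill within each Category, and then use combine_first to restore the original values wherever nothing was filled: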

df['Stage'] = df['Stage'].mask(~df['Stage'].isin(['Paid', 'Delivered'])) \
                         .groupby(df['Category']).ffill() \
                         .combine_first(df['Stage'])
print(df)

# Output
  Category        Date      Stage
0        A  2021-11-01    Ordered
1        A  2021-12-01       Paid
2        A  2022-01-01       Paid
3        B  2021-08-01           
4        B  2021-09-01    Ordered
5        B  2021-10-01       Paid
6        B  2021-11-01       Paid
7        B  2021-12-01  Delivered
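
For anyone who wants to reproduce this, here is a minimal self-contained sketch of the same approach, split into steps. The sample frame is rebuilt from the question; it assumes the blank Stage cells are empty strings and that Date holds ISO-formatted strings (so a plain sort orders them chronologically). The intermediate names `sticky` and `filled` are only illustrative.

```python
import pandas as pd

# Sample data copied from the question.
# Assumption: blank stages are empty strings, dates are ISO-formatted strings.
df = pd.DataFrame({
    "Category": ["A", "A", "A", "B", "B", "B", "B", "B"],
    "Date": ["2021-11-01", "2021-12-01", "2022-01-01", "2021-08-01",
             "2021-09-01", "2021-10-01", "2021-11-01", "2021-12-01"],
    "Stage": ["Ordered", "Paid", "", "", "Ordered", "Paid", "Ordered", "Delivered"],
})

# Sort so that "next rows" means later dates within each category.
df = df.sort_values(["Category", "Date"]).reset_index(drop=True)

# Step 1: keep only the stages that should propagate; everything else becomes NaN.
sticky = df["Stage"].mask(~df["Stage"].isin(["Paid", "Delivered"]))

# Step 2: forward-fill within each category, so a fill never crosses categories.
filled = sticky.groupby(df["Category"]).ffill()

# Step 3: rows with nothing to inherit fall back to their original value.
df["Stage"] = filled.combine_first(df["Stage"])

print(df)
```

Because "Ordered" is masked before the forward-fill, an "Ordered" row that comes after a "Paid" row is overwritten with "Paid", which is what the expected result in the question asks for. Rows that appear before any "Paid" or "Delivered" in their category keep their original value.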
Corralien