I would like to add a new column to my dataframe that contains the most recent 'revenue' value where 'promotion' == 1, excluding the current row. The dataframe will always be sorted by 'day' in descending order. For rows near the bottom of the dataframe where there is not a previous row with 'promotion' == 1, it should ideally return nothing (but this is not a hard requirement as we will generally be most interested in recent records).
So if this is my dataframe:
import pandas as pd
import numpy as np
df = pd.DataFrame({'day':[10, 9, 8, 7, 6, 5, 4, 3, 2, 1],
'revenue':[12000, 13000, 17000, 14000, 16000,
15000, 19000, 17000, 12000, 11000],
'promotion':[0, 0, 1, 1, 0, 0, 1, 0, 1, 0]})
The new column would look like:
df['last_promo_rev'] = [17000, 17000, 14000, 19000, 19000, 19000, 12000, 12000, np.nan, np.nan]