I have a pandas dataframe containing the below data, and i would like to add a new column which, for each date, returns the most frequently occurring 'weather_type' over the previous 3 days. Where the result is a tie, i'd like the most recent 'weather_type' to be returned.
d = {'date': ['17/02/2017', '18/02/2017', '19/02/2017', '20/02/2017',
'21/02/2017', '22/02/2017'], 'precipmm': [1, 0, 3, 2, 7, 8], 'weather_type':
['rain', 'sun', 'rain', 'snow', 'snow', 'rain']}
df = pd.DataFrame(data=d)
df['date']=pd.to_datetime(df['date'], format='%d/%m/%Y')
df['rollingsum_precipmm']=df['precipmm'].rolling(window=3).sum()
I've already managed to create a new column containing the sum of the total 'precipmm' over the last 3 days using the below:
df['rollingsum_precipmm']=df['precipmm'].rolling(window=3).sum()
I suspect the answer revolves around this, but as yet i've been unable to find a solution.
Any help much appreciated as always
Colin