I am trying to use the DatetimeIndex of a pandas DataFrame to assign a new column called 'season'.
from datetime import datetime
import pandas as pd

# months belonging to each season
winter = [12, 1, 2]
spring = [3, 4, 5]
summer = [6, 7, 8]
autumn = [9, 10, 11]
DTX_index = [datetime(2017, 2, 1).date(), datetime(2017, 3, 1).date(), datetime(2017, 6, 1).date(), datetime(2017, 9, 1).date()]
DTX_index = pd.to_datetime(DTX_index, utc=True)
df = pd.DataFrame(index=DTX_index)
I'm hoping for something like this:
season
2017-02-01 00:00:00+00:00 winter
2017-03-01 00:00:00+00:00 spring
2017-06-01 00:00:00+00:00 summer
2017-09-01 00:00:00+00:00 autumn
I can assign the month:
df['month'] = df.index.month
and I can assign a boolean for a single season:
df['season'] = df.index.month.isin([12,1,2])
But I'm not sure how to assign the season based on the month across the whole DataFrame. I tried an apply function:
def add_season(x):
    if x.index.month.isin([12,1,2]):
        return 'winter'
    elif x.index.month.isin([3,4,5]):
        return 'spring'
    elif x.index.month.isin([6,7,8]):
        return 'summer'
    elif x.index.month.isin([9,10,11]):
        return 'autumn'
df['season'] = df.apply(add_season)
But this raises a ValueError:
ValueError: ('The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()', 'occurred at index season')
presumably because the function is operating on a whole Series rather than element-wise.
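A minimal sketch of what I think is going on, outside of apply. The names `idx`, `is_winter` and `error_message` are only for illustration: `isin` on the whole index gives one boolean per row, and putting that array in an `if` seems to be what triggers the error.

```python
import pandas as pd

# Same four dates as in the question.
idx = pd.to_datetime(
    ["2017-02-01", "2017-03-01", "2017-06-01", "2017-09-01"], utc=True
)

# isin on the whole index returns one boolean per row, not a single bool.
is_winter = idx.month.isin([12, 1, 2])
print(is_winter)

# Using that array in an `if` asks for a single truth value, which fails.
try:
    if is_winter:
        pass
except ValueError as err:
    error_message = str(err)
    print(error_message)
```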
I'm sure someone with a bit more experience with apply functions than me could fix this pretty quickly.
Many thanks
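For reference, one possible element-wise approach that avoids apply altogether is sketched below. It assumes a month-to-season lookup dict (the names `seasons` and `month_to_season` are hypothetical, not from the code above) and uses `Index.map` on `df.index.month`. This is only one way to do it, not necessarily the best.

```python
from datetime import datetime

import pandas as pd

# Season definitions from the question, gathered into one dict (hypothetical name).
seasons = {
    "winter": [12, 1, 2],
    "spring": [3, 4, 5],
    "summer": [6, 7, 8],
    "autumn": [9, 10, 11],
}

# Invert to a month -> season lookup, e.g. {12: "winter", 1: "winter", ...}.
month_to_season = {
    month: season for season, months in seasons.items() for month in months
}

DTX_index = pd.to_datetime(
    [
        datetime(2017, 2, 1),
        datetime(2017, 3, 1),
        datetime(2017, 6, 1),
        datetime(2017, 9, 1),
    ],
    utc=True,
)
df = pd.DataFrame(index=DTX_index)

# Map each month number to its season, one element at a time.
df["season"] = df.index.month.map(month_to_season)
print(df)
```

Because the lookup is applied to each month individually, no array is ever evaluated in an `if`, so the ambiguous truth value problem should not come up.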