
I am trying to use the DatetimeIndex of a pandas DataFrame to assign a new column called 'season'.

from datetime import datetime
import pandas as pd

winter = [12, 1, 2]
spring = [3, 4, 5]
summer = [6, 7, 8]
autumn = [9, 10, 11]

DTX_index = [datetime(2017, 2, 1).date(), datetime(2017, 3, 1).date(), datetime(2017, 6, 1).date(), datetime(2017, 9, 1).date()]
DTX_index = pd.to_datetime(DTX_index, utc=True)
df = pd.DataFrame(index=DTX_index)

I'm hoping for something like this:

                           season
2017-02-01 00:00:00+00:00   winter
2017-03-01 00:00:00+00:00   spring
2017-06-01 00:00:00+00:00   summer
2017-09-01 00:00:00+00:00   autumn

Assigning a month works:

df['month'] = df.index.month

So does assigning a boolean for a single season:

df['season'] = df.index.month.isin([12,1,2])

I'm not sure how to assign a season based on the month over the whole df. I tried an apply function:

def add_season(x):

    if x.index.month.isin([12,1,2]):
        return 'winter'
    elif x.index.month.isin([3,4,5]):
        return 'spring'
    elif x.index.month.isin([6,7,8]):
        return 'summer'
    elif x.index.month.isin([9,10,11]):
        return 'autumn'

df['season'] = df.apply(add_season)

But this raises a ValueError:

ValueError: ('The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()', 'occurred at index season')

Presumably this is because the function is operating on a whole Series rather than element-wise.

I'm sure someone with a bit more experience with apply functions than me could fix this pretty quickly.

Many thanks

user3062260

2 Answers

3

IIUC, you can build a month-to-season dict from your lists and map it over the index months:

d = {
    **dict.fromkeys(winter, 'winter'),
    **dict.fromkeys(spring, 'spring'),
    **dict.fromkeys(summer, 'summer'),
    **dict.fromkeys(autumn, 'autumn'),
}
df['Value'] = list(map(d.get, df.index.month))
df
Out[697]: 
                            Value
2017-02-01 00:00:00+00:00  winter
2017-03-01 00:00:00+00:00  spring
2017-06-01 00:00:00+00:00  summer
2017-09-01 00:00:00+00:00  autumn
BENY
    I haven't seen this way to build a dict before, so thanks for posting! It's just slightly less readable than the other answer, so I've gone with that. Thanks though! – user3062260 Apr 29 '19 at 16:33
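
For reference, here is a self-contained sketch of the same dict-based idea. It is not the answerer's exact code: it builds the dict with a comprehension and passes it straight to `Index.map`, which should make the `list(map(...))` step unnecessary. The names `month_to_season` and `dates`, and the column name `season`, are illustrative choices.

```python
from datetime import datetime
import pandas as pd

winter, spring, summer, autumn = [12, 1, 2], [3, 4, 5], [6, 7, 8], [9, 10, 11]

# Same four dates as in the question, as a UTC DatetimeIndex.
dates = [datetime(2017, m, 1) for m in (2, 3, 6, 9)]
df = pd.DataFrame(index=pd.to_datetime(dates, utc=True))

# Build {month number: season name}; assumes no month appears in two seasons.
month_to_season = {
    month: name
    for name, months in [("winter", winter), ("spring", spring),
                         ("summer", summer), ("autumn", autumn)]
    for month in months
}

# Index.map accepts a dict; the resulting Index is assigned positionally.
df["season"] = df.index.month.map(month_to_season)
print(df)
```

If a month were listed under two seasons, the later entry would silently win, so the lists need to be disjoint.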
2

You can create a mapping frame and use map. For this to work correctly, seasons should contain distinct months.


u = pd.DataFrame().assign(
    winter=winter, spring=spring, summer=summer, autumn=autumn
).melt().set_index('value')

df.assign(month=df.index.month.map(u.variable))

                            month
2017-02-01 00:00:00+00:00  winter
2017-03-01 00:00:00+00:00  spring
2017-06-01 00:00:00+00:00  summer
2017-09-01 00:00:00+00:00  autumn
user3483203
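
On the original `apply` attempt: `df.apply(add_season)` hands the function each column as a whole Series, so `x.index.month.isin(...)` produces a boolean array and the `if` cannot reduce it to a single truth value. Below is a minimal sketch of an element-wise variant, assuming the function is rewritten to take one month number. The name `season_of_month` is hypothetical.

```python
from datetime import datetime
import pandas as pd

# Same four dates as in the question, as a UTC DatetimeIndex.
dates = [datetime(2017, m, 1) for m in (2, 3, 6, 9)]
df = pd.DataFrame(index=pd.to_datetime(dates, utc=True))

def season_of_month(month):
    """Return the season name for a single month number (1-12)."""
    if month in (12, 1, 2):
        return "winter"
    elif month in (3, 4, 5):
        return "spring"
    elif month in (6, 7, 8):
        return "summer"
    return "autumn"

# Index.map calls the function once per element, so each `if` sees a scalar.
df["season"] = df.index.month.map(season_of_month)
print(df)
```

This keeps the if/elif structure from the question. For large frames the dict or mapping-frame approaches in the answers are likely to be faster, since they avoid a Python-level call per row.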