12

Here is my dataframe. I need to create a new column based on the hour of the day, with row values such as morning, afternoon, evening, or night.

DataFrame

Here is my code

if ((prods['hour'] < 4) & (prods['hour'] > 8 )):
    prods['session'] = 'Early Morning'
elif ((prods['hour'] < 8) & (prods['hour'] > 12 )):
    prods['session'] = 'Morning'
elif ((prods['hour'] < 12) & (prods['hour'] > 16 )):
    prods['session'] = 'Noon'
elif ((prods['hour'] < 16) & (prods['hour'] > 20 )):
    prods['session'] = 'Eve'
elif ((prods['hour'] < 20) & (prods['hour'] > 24 )):
    prods['session'] = 'Night'
elif ((prods['hour'] < 24) & (prods['hour'] > 4 )):
    prods['session'] = 'Late Night'

Here is the error I got:

ValueError                                Traceback (most recent call last) in
----> 1 if (prods['hour'] > 4 and prods['hour']< 8):
      2     prods['session'] = 'Early Morning'
      3 elif (prods['hour'] > 8 and prods['hour'] < 12):
      4     prods['session'] = 'Morning'
      5 elif (prods['hour'] > 12 and prods['hour'] < 16):

/anaconda3/lib/python3.7/site-packages/pandas/core/generic.py in __nonzero__(self)
   1476         raise ValueError("The truth value of a {0} is ambiguous. "
   1477                          "Use a.empty, a.bool(), a.item(), a.any() or a.all()."
-> 1478                          .format(self.__class__.__name__))
   1479
   1480     __bool__ = __nonzero__

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

Kindly help

FObersteiner
Vijayaraghavan

2 Answers

23

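Use pd.cut, or apply a custom function to each value. In the function, combine the conditions with and, swap the comparisons (< becomes > and > becomes <=) so that each range can actually match, and return a label from every branch: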

import pandas as pd

prods = pd.DataFrame({'hour':range(1, 25)})

b = [0,4,8,12,16,20,24]
l = ['Late Night', 'Early Morning','Morning','Noon','Eve','Night']
prods['session'] = pd.cut(prods['hour'], bins=b, labels=l, include_lowest=True)

def f(x):
    if (x > 4) and (x <= 8):
        return 'Early Morning'
    elif (x > 8) and (x <= 12):
        return 'Morning'
    elif (x > 12) and (x <= 16):
        return 'Noon'
    elif (x > 16) and (x <= 20):
        return 'Eve'
    elif (x > 20) and (x <= 24):
        return 'Night'
    elif (x <= 4):
        return 'Late Night'

prods['session1'] = prods['hour'].apply(f)
print(prods)
    hour        session       session1
0      1     Late Night     Late Night
1      2     Late Night     Late Night
2      3     Late Night     Late Night
3      4     Late Night     Late Night
4      5  Early Morning  Early Morning
5      6  Early Morning  Early Morning
6      7  Early Morning  Early Morning
7      8  Early Morning  Early Morning
8      9        Morning        Morning
9     10        Morning        Morning
10    11        Morning        Morning
11    12        Morning        Morning
12    13           Noon           Noon
13    14           Noon           Noon
14    15           Noon           Noon
15    16           Noon           Noon
16    17            Eve            Eve
17    18            Eve            Eve
18    19            Eve            Eve
19    20            Eve            Eve
20    21          Night          Night
21    22          Night          Night
22    23          Night          Night
23    24          Night          Night
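
To make the cause of the error in the question concrete, here is a minimal sketch. It first reproduces the ValueError with a plain if on a Series, then labels the rows without a Python-level function by building one boolean mask per range and passing them to np.select. The column name session2, the variable names, and the 'Unknown' default are assumptions made for illustration; the ranges mirror the function f above.

```python
import numpy as np
import pandas as pd

prods = pd.DataFrame({'hour': range(1, 25)})

# `if` needs a single True/False, but comparing a Series gives one
# boolean per row, so pandas refuses to collapse it and raises ValueError.
try:
    if prods['hour'] > 4:
        pass
except ValueError as err:
    error_message = str(err)

# Vectorized alternative: one boolean mask per label, combined with `&`
# (element-wise) instead of `and` (which asks for a single truth value).
hour = prods['hour']
conditions = [
    hour <= 4,
    (hour > 4) & (hour <= 8),
    (hour > 8) & (hour <= 12),
    (hour > 12) & (hour <= 16),
    (hour > 16) & (hour <= 20),
    (hour > 20) & (hour <= 24),
]
labels = ['Late Night', 'Early Morning', 'Morning', 'Noon', 'Eve', 'Night']

# For each row, np.select takes the label of the first mask that is True;
# 'Unknown' is a hypothetical fallback for hours outside every range.
prods['session2'] = np.select(conditions, labels, default='Unknown')
```

With the sample frame above, session2 should match session1 row for row; the masks just move the per-row comparison from a Python function into NumPy.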
jezrael
  • Thanks @jezrael, it's working fine. Kindly suggest some learning material so I can improve. – Vijayaraghavan Apr 08 '19 at 11:14
  • 1
    @Vijayaraghavan - [here](http://pandas.pydata.org/pandas-docs/stable/getting_started/tutorials.html) are nice tutorials, I like modern pandas. – jezrael Apr 08 '19 at 11:15
19

After some research, this is the simplest and most efficient implementation I could find.

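prods['period'] = (prods['hour_int'].dt.hour % 24 + 4) // 4
# Assign the result back: replace(..., inplace=True) on a selected column
# may not update prods in newer pandas versions.
prods['period'] = prods['period'].replace({1: 'Late Night',
                                           2: 'Early Morning',
                                           3: 'Morning',
                                           4: 'Noon',
                                           5: 'Evening',
                                           6: 'Night'})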

I hope this helps.
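
For readers whose column already holds plain integers, as in the question, the same arithmetic can be sketched without the .dt accessor. This is an illustration under assumptions, not the answer's exact code: the column is assumed to be named hour and to contain values 0 to 23, and the names dict and block variable are hypothetical.

```python
import pandas as pd

# Assumption: 'hour' already holds integers 0-23; for a datetime column
# you would take .dt.hour first, as the answer above does.
prods = pd.DataFrame({'hour': range(24)})

names = {1: 'Late Night', 2: 'Early Morning', 3: 'Morning',
         4: 'Noon', 5: 'Evening', 6: 'Night'}

# Adding 4 and floor-dividing by 4 groups the hours into six
# 4-hour blocks numbered 1 to 6 (0-3 -> 1, 4-7 -> 2, ..., 20-23 -> 6).
block = (prods['hour'] % 24 + 4) // 4
prods['period'] = block.map(names)
```

Note that the block boundaries differ from the pd.cut answer: here hour 4 falls in 'Early Morning', whereas the right-closed bins there put hour 4 in 'Late Night'.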

Marcel Motta