I have a column with timestamps

 start_time: 
 0    2016-06-04 05:18:49
 1    2016-06-04 06:50:12
 2    2016-06-04 08:16:02
 3    2016-06-04 15:05:13
 4    2016-06-04 15:24:25

I want to use a function on the start_time column to round minutes >= 30 up to the next hour.

 def extract_time(col):
      time = col.strftime('%H:%M')
      min= int(time.strip(':')[1])
      hour= int(time.strip(':')[0])
      if min >= 30:
           return hour + 1
      return hour

Then I want to create a new column, 'hour', with the rounded hours:

 df['hour'] = df['start_time'].apply(extract_time)

Instead of getting an 'hour' column with the rounded hours, I get the following:

 0    <function extract_hour at 0x128722b90>
 1    <function extract_hour at 0x128722b90>
 2    <function extract_hour at 0x128722b90>
 3    <function extract_hour at 0x128722b90>
 4    <function extract_hour at 0x128722b90>
EJSuh

2 Answers


You can use the following vectorized solution:

In [30]: df['hour'] = df['start_time'].dt.round('H').dt.hour

In [31]: df
Out[31]:
           start_time  hour
0 2016-06-04 05:18:49     5
1 2016-06-04 06:50:12     7
2 2016-06-04 08:16:02     8
3 2016-06-04 15:05:13    15
4 2016-06-04 15:24:25    15
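
For anyone who wants to run this end to end, here is a minimal self-contained sketch. It assumes `start_time` is already a datetime64 column (built here with `pd.to_datetime`). It uses the lowercase `'h'` alias, since newer pandas versions warn about the uppercase `'H'`. The second column, `hour_half_up`, is a name made up for illustration: it shifts by 30 minutes and floors, which is one way to guarantee that any minute >= 30 rounds up. As far as I know, `round` may resolve an exact half-hour tie (e.g. 06:30:00) to the even hour instead.

```python
import pandas as pd

# Sample data from the question, parsed into a datetime64 column.
df = pd.DataFrame({
    "start_time": pd.to_datetime([
        "2016-06-04 05:18:49",
        "2016-06-04 06:50:12",
        "2016-06-04 08:16:02",
        "2016-06-04 15:05:13",
        "2016-06-04 15:24:25",
    ])
})

# Round each timestamp to the nearest hour, then extract the hour component.
df["hour"] = df["start_time"].dt.round("h").dt.hour

# Alternative (illustrative): add 30 minutes and floor, so minute >= 30 always rounds up.
shifted = df["start_time"] + pd.Timedelta(minutes=30)
df["hour_half_up"] = shifted.dt.floor("h").dt.hour

print(df)
```

Both approaches wrap past midnight: a timestamp at 23:45 ends up with hour 0 on the next day, not 24.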
MaxU - stand with Ukraine

Try this:

df['start_time2'] = df['start_time'].dt.floor('h')

or even this:

df['start_time2'] = df['start_time'].apply(lambda x: x.replace(minute=0, second=0))
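
Note that both snippets above truncate to the start of the hour and do not round, so 06:50:12 becomes 06:00:00. If a row-wise function is still preferred, the sketch below shows one way to do the rounding the question asks for by reading `.hour` and `.minute` directly from each timestamp. The name `round_hour` and the wrap-around at midnight (`% 24`) are assumptions, not something stated in the question.

```python
import pandas as pd

def round_hour(ts):
    """Return the hour of ts, bumped by one when the minute is 30 or more."""
    # % 24 wraps 23:30 and later to hour 0 (assumed desired behavior).
    return (ts.hour + (ts.minute >= 30)) % 24

s = pd.Series(pd.to_datetime([
    "2016-06-04 06:50:12",
    "2016-06-04 15:24:25",
    "2016-06-04 23:45:00",
]))

floored = s.dt.floor("h").dt.hour  # truncation: 06:50 stays in hour 6
rounded = s.apply(round_hour)      # rounding: 06:50 moves to hour 7

print(pd.DataFrame({"start_time": s, "floored": floored, "rounded": rounded}))
```

Passing the function itself to `apply` (with no parentheses and no wrapping lambda that returns the function object) is what makes pandas call it once per value.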
Mewtwo