I am really new to Python and pandas. I have a dataframe that contains, among other columns, a "Creation Date-Time" column.

After the "Creation Date-Time" column I want to add three more columns, "Creation Time of Day", "Creation Day of Week", and "Creation Month", all derived from the date and time values in "Creation Date-Time".

For example, "Creation Time of Day" would contain values like "Morning", "Evening", etc., based on the time. "Creation Day of Week" would contain values like "Monday", "Tuesday", etc., and "Creation Month" would contain values like "January", "February", etc. How do I do this?

  • [Please don't post images of code/data (or links to them)](http://meta.stackoverflow.com/questions/285551/why-may-i-not-upload-images-of-code-on-so-when-asking-a-question) – jezrael May 29 '20 at 10:10
  • What is the logic for `Creation Time of Day`? Can you be more specific? – jezrael May 29 '20 at 10:10
  • It can be useful as fixed effects in some time series analysis. He is likely looking to create some categorical variables that show if it is morning, mid-day or evening? – torkestativ May 29 '20 at 10:15
  • Yes that is what I am trying to do. Change the date-time into categorical values. –  May 29 '20 at 10:19

1 Answer

Use:

import pandas as pd

# Sample data: 'Creation Date-Time' starts out as ISO-formatted strings
df = pd.DataFrame({
    'A': list('abcdef'),
    'B': [4, 5, 4, 5, 5, 4],
    'Creation Date-Time': pd.date_range('2015-01-02 15:07:01', periods=6, freq='231H')
                            .strftime('%Y-%m-%dT%H:%M:%S'),
    'D': [1, 3, 5, 7, 1, 0],
    'E': [5, 3, 6, 9, 2, 4],
    'F': list('aaabbb')
})
print(df)
   A  B   Creation Date-Time  D  E  F
0  a  4  2015-01-02T15:07:01  1  5  a
1  b  5  2015-01-12T06:07:01  3  3  a
2  c  4  2015-01-21T21:07:01  5  6  a
3  d  5  2015-01-31T12:07:01  7  9  b
4  e  5  2015-02-10T03:07:01  1  2  b
5  f  4  2015-02-19T18:07:01  0  4  b

First convert the column to datetimes with to_datetime, then get the position of the Creation Date-Time column with Index.get_loc; that position tells DataFrame.insert where to place the new columns. For the time-of-day categories use cut on the hour, and for the names of days and months use Series.dt.day_name and Series.dt.month_name:

df['Creation Date-Time'] = pd.to_datetime(df['Creation Date-Time'])

idx = df.columns.get_loc('Creation Date-Time')

#https://stackoverflow.com/a/55571425/2901002
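# Bin edges are hours of the day; bins are right-closed, e.g. (8, 12]
b = [0,4,8,12,16,20,24]
l = ['Late Night', 'Early Morning','Morning','Noon','Eve','Night']
# include_lowest=True keeps hour 0 in the first bin instead of producing NaN
s = pd.cut(df['Creation Date-Time'].dt.hour, bins=b, labels=l, include_lowest=True)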

df.insert(idx + 1, 'Creation Time of Day', s)
df.insert(idx + 2, 'Creation Day of Week', df['Creation Date-Time'].dt.day_name())
df.insert(idx + 3, 'Creation Month', df['Creation Date-Time'].dt.month_name())
print(df)

   A  B  Creation Date-Time Creation Time of Day Creation Day of Week  \
0  a  4 2015-01-02 15:07:01                 Noon               Friday   
1  b  5 2015-01-12 06:07:01        Early Morning               Monday   
2  c  4 2015-01-21 21:07:01                Night            Wednesday   
3  d  5 2015-01-31 12:07:01              Morning             Saturday   
4  e  5 2015-02-10 03:07:01           Late Night              Tuesday   
5  f  4 2015-02-19 18:07:01                  Eve             Thursday   

  Creation Month  D  E  F  
0        January  1  5  a  
1        January  3  3  a  
2        January  5  6  a  
3        January  7  9  b  
4       February  1  2  b  
5       February  0  4  b  
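
A note on bin edges: pd.cut builds right-closed bins by default, so with the edges 0, 4, 8, ... an hour of exactly 0 gets no label unless include_lowest=True is passed, and hour 12 lands in 'Morning' rather than 'Noon'. If left-closed bins such as [12, 16) suit the data better, right=False is one option. The following is only a minimal sketch; the hours Series is hypothetical sample data picked to sit on the edges:

```python
import pandas as pd

# Hypothetical sample hours chosen to sit on or near the bin edges
hours = pd.Series([0, 4, 12, 23])

bins = [0, 4, 8, 12, 16, 20, 24]
labels = ['Late Night', 'Early Morning', 'Morning', 'Noon', 'Eve', 'Night']

# Default: right-closed bins (0, 4], (4, 8], ...; hour 0 falls outside the first bin
right_closed = pd.cut(hours, bins=bins, labels=labels)

# right=False: left-closed bins [0, 4), [4, 8), ...; every hour 0-23 gets a label
left_closed = pd.cut(hours, bins=bins, labels=labels, right=False)

print(pd.DataFrame({'hour': hours,
                    'right_closed': right_closed,
                    'left_closed': left_closed}))
```

Which convention is appropriate depends on how the time-of-day categories are meant to be defined.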
jezrael