2

I have the following data structure in a dataframe:

Date          Hour
01/01/2016    1
01/01/2016    2
01/01/2016    3
01/01/2016    4
   ...        ...
01/01/2016    13
01/01/2016    14
   ...        ...

The data is hourly data for 2 years. I managed to get the dates into a datetime format, but hours were parsed separatly and are in a separate column. How do I transform the hours into a datetime object and add them to dates, so I get the proper datetime structure:

Date          Hour     DateTime
01/01/2016    1        01/01/2016 01:00:00
01/01/2016    2        01/01/2016 02:00:00
01/01/2016    3        01/01/2016 03:00:00
01/01/2016    4        01/01/2016 04:00:00
   ...        ...             ...
01/01/2016    13       01/01/2016 13:00:00
01/01/2016    14       01/01/2016 14:00:00
   ...        ...             ...
timgeb
  • 76,762
  • 20
  • 123
  • 145
Andraxyz
  • 33
  • 6
  • Are the dates really in datetime format? Or are they strings? – timgeb Dec 13 '18 at 10:05
  • They were originally in the form of 01.01.2016. I used this line to transform them to datetime: df5['datum'] = pd.to_datetime(df5['datum'], dayfirst=True).dt.date – Andraxyz Dec 13 '18 at 10:08
  • Depending on whether the data covers the entire period of the nth hour of the day or a certain point in time, you have to consider how to represent this in your `DateTime`. As a European, I would suggest mapping Hour `1` to `00:00:00`, since `2016-01-01T24:00:00` describes the same point in time as `2016-01-02T00:00:00` (leaving aside leap seconds and such) – Raketenolli Dec 13 '18 at 10:11
  • Hour 1 represents the whole period of the nth hour, as the data is meant for the hourly production of electricity in kWh. – Andraxyz Dec 13 '18 at 10:15

3 Answers3

1

refer this one and make your own code You can use dt.strftime if you need to convert datetime to other formats (but note that then dtype of column will be object (string)):

import pandas as pd

df = pd.DataFrame({'DOB': {0: '26/1/2016 ', 1: '26/1/2016 '}})
print (df)

s DOB 0 26/1/2016 1 26/1/2016

df['DOB'] = pd.to_datetime(df.DOB)
print (df)

s DOB 0 2016-01-26 1 2016-01-26

df['DOB1'] = df['DOB'].dt.strftime('%m/%d/%Y')
print (df)

s DOB DOB1 0 2016-01-26 01/26/2016 1 2016-01-26 01/26/2016

1

Given

>>> df                                                                                                                 
         Date  Hour
0  01/01/2016     1
1  01/01/2016     2
2  01/01/2016     3
3  01/01/2016     4

I would issue

>>> df['Date'] = pd.to_datetime(df['Date']) # pick dayfirst=True or False in your code                                                                            
>>> df                                                                                                                 
        Date  Hour
0 2016-01-01     1
1 2016-01-01     2
2 2016-01-01     3
3 2016-01-01     4

and then construct the 'DateTime' column by vectorized addition of timedeltas:

>>> df['DateTime'] = df['Date'] + pd.to_timedelta(df['Hour'], 'h')                                                     
>>> df                                                                                                                 
        Date  Hour            DateTime
0 2016-01-01     1 2016-01-01 01:00:00
1 2016-01-01     2 2016-01-01 02:00:00
2 2016-01-01     3 2016-01-01 03:00:00
3 2016-01-01     4 2016-01-01 04:00:00
timgeb
  • 76,762
  • 20
  • 123
  • 145
0

Assuming your 2 columns 'Date' and 'Hour' are strings. But if not, this should take care of it. Takes the 2 string columns, joins them, then converts to datetime on the specific format:

if not is_string_dtype(df['Date']):
    df['Date'] = df['Date'].dt.strftime('%m/%d/%Y')
if not is_string_dtype(df['Hour']):
    df['Hour'] = df['Hour'].astype(str)

df['DateTime'] = df['Date'] + ' ' + df['Hour']
df['DateTime'] = pd.to_datetime(pd.Series(df['DateTime']), format="%m/%d/%Y %H")
marc_s
  • 732,580
  • 175
  • 1,330
  • 1,459
chitown88
  • 27,527
  • 4
  • 30
  • 59