Goal: I read measurement data from a .csv and convert them to a dataframe. Then I add the date information from the filename to the time string which is already in the dataframe. And the last step is to convert this string with date and time informatin into a datetime object.
First steps that worked:
import pandas as pd
filename = '2022_02_14_data_0.csv
path = 'C:/Users/ma1075116/switchdrive/100_Schaltag/100_Digitales_Model/Messungen/'
measData = pd.read_csv(path+filename, sep = '\t', header = [0,1], encoding = 'ISO-8859-1')
# add the date to the timestamp string
measData['Timestamp'] = filename[:11]+measData['Timestamp']
An object in the Dataframe measData['Timestamp'] has now exacty a string with the following pattern:
'2022_02_14_00:00:06'
Now I want to convert this string to datetime:
measData['Timestamp'] = pd.to_datetime(measData['Timestamp'], format= '%Y_%m_%d_%H:%M:%S')
This raises the error:
ValueError: to assemble mappings requires at least that [year, month, day] be specified: [day,month,year] is missing
Why do I get this error and how can I avoid it? I am pretty shure that the format is correct.
Edit: I wrote a sample code which should do exactly the same, and it works:
filename = '2022_02_14_data_0.csv'
timestamps = {'Timestamp': ['00:00:00', '00:00:01', '00:00:04']}
testFrame = pd.DataFrame(timestamps)
testFrame['Timestamp'] = testFrame['Timestamp']#
testFrame['Timestamp'] = filename[:11]+testFrame['Timestamp']
testFrame['Timestamp'] = pd.to_datetime(testFrame['Timestamp'], format= '%Y_%m_%d_%X')
My next step is now to check if all timestamp entries in the dataframe have the same format.
Solution: I do not understand the error but I found a working solution. Now I parse for the time already in the read_csv function and add the date information from the filename there. This works, measData(timeStamp) has now the datatype datetime64.
filename = '2022_02_14_data_0.csv'
path = 'C:/Users/ma1075116/switchdrive/100_Schaltag/100_Digitales_Model/Messungen/'
measData = pd.read_csv(path+filename, sep = '\t', header = [0,1],
parse_dates=[0], # parse for the time in the first column
date_parser = lambda col: pd.to_datetime(filename[:11]+col, format= '%Y_%m_%d_%X'),
encoding = 'ISO-8859-1')