I have a really messy file where there are datetime that I need to read and use it as an index. (I am adding this to clarify how it looks my data). My messy file were the datetime are located look like:
31.01.2016 13:59:13 31.01.2016 13:59:13 31.01.2016 14:39:20 31.01.2016 14:39:20 31.01.2016 15:19:27 31.01.2016 15:19:27 31.01.2016 15:59:34 31.01.2016 15:59:34 31.01.2016 16:39:41 31.01.2016 16:39:41 31.01.2016 17:19:48 31.01.2016 17:19:48 31.01.2016 17:59:55 31.01.2016 17:59:55 31.01.2016 18:40:02
The code I used for reading the file is the next (It is a really messy file):
df = pd.melt(pd.read_csv(file, delimiter='\t', skiprows=3, index_col=0, decimal=",").rename(columns={"Unnamed: 1":"Depth [m]"}), id_vars=["Depth [m]"], var_name="Date", value_name="Temperature").set_index("Date"))
The result that I have is a DataFrame. This DataFrame has a column which should be converted to DateTime (Date) and afterwards, I will use it as an index. When I am reading the DataFrame and try to convert the dates to a DateTime is not working. In this step, the Date is an "object". My Data looks like this:
Date Depth [m] Temperature
0 01.01.2016 00:32:09 0.0 26.59
1 01.01.2016 01:12:16 0.0 26.67
2 01.01.2016 01:52:23 0.0 26.45
3 01.01.2016 02:32:30 0.0 26.41
4 01.01.2016 03:12:37 0.0 26.43
When I apply the pd.to_datetime() with "coerce" occurs the next:
df["Date] = pd.to_datetime(df["Date], errors="coerce", format= "%d.%m.%Y %H:%M:%S" )
DatetimeIndex: 10001662 entries, 2016-01-01 00:32:09 to 2015-12-31 23:52:02
This can work but I get a lot of NAT values within
When I apply the pd.to_datetime() with "ignore" and check the info of the DF occurs the next:
df["Date] = pd.to_datetime(df["Date], errors="ignore", format= "%d.%m.%Y %H:%M:%S" )
Stills and object
Index: 10001662 entries, 01.01.2016 00:32:09 to 31.12.2015 23:52:02
When I apply the pd.to_datetime() with "raise" it is not working.
Is there any idea about what I am doing incorrectly?