0

I have a CSV where the first column is a day and a time and the remaining columns are measurements taken at those times. The dates are in the form "MM/DD/YYYY hh:mm:ss" in a single column. Python reads this as a string. How to I convert this to a date and time? Python doesn't know that '09/14/2016 23:00:00' comes right before '09/15/2016 0:00:00'

In MatLab, I can use

time = datenum(filename{:,1})

datetick('x','mmm-dd HH:MM:SS')

but I'm not sure what to use in Python. Is there a similar command in this language? I have tried using datetime, but I haven't much luck.

Thanks!

Note: I'm using Pandas to read the CSV

  • Welcome to SO! To make it easier to give a qualified answer, please add a [mre] (also check out [ask]). As it stands, the question is pretty unspecific. In general, if you use `pandas`, you can make simple plots using `DataFrame.plot()`, see [the docs](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.plot.html). – FObersteiner Jun 09 '20 at 06:35
  • Hi there, Thank you. Sorry this wasn't clear. – Camille Woicekowski Jun 09 '20 at 19:26
  • I'm not having issues plotting, I'm having issues with Python reading dates sequentially. Dates are currently stored as a string and I need them as, well, dates. If I want to look at a period of time overnight it doesn't know that 09/14/2016 23:00 comes just before 09/15/2016 0:00. I will update my question to make this more clear. – Camille Woicekowski Jun 09 '20 at 19:32

1 Answers1

0

You will have to parse the dates (timestamps) from string to datetime dtype to have them sorted appropriately. If you use pandas.read_csv to load your csv to a DataFrame, the easiest thing to do is use the parse_dates keyword. Ex:

from  io import StringIO
import pandas as pd

s="""Timestamp,Value
06/01/2020 17:05:00,9506.01
06/01/2020 17:10:00,9513.44
06/01/2020 17:15:00,9521.56"""

df = pd.read_csv(StringIO(s), parse_dates=["Timestamp"])

for c in df.columns:
    print(f"{c} - {df[c].dtype}")
# Timestamp - datetime64[ns]
# Value - float64

Another option would be to convert from string to datetime after import using pandas.to_datetime:

df = pd.read_csv(StringIO(s))
df["Timestamp"].dtype
# dtype('O') # after import we only have strings...
df["Timestamp"] = pd.to_datetime(df["Timestamp"])

for c in df.columns:
    print(f"{c} - {df[c].dtype}")
# Timestamp - datetime64[ns]
# Value - float64
FObersteiner
  • 22,500
  • 8
  • 42
  • 72