I've been trying to parse dates of the form Jul 07, 2018 into dd-mm-yyyy format for my financial time series project. Being new to pandas, I am not able to do it the usual way, i.e., using

I've tried:

dateparse = lambda dates: pd.datetime.strptime(dates, '%m/%d/%Y')
data = pd.read_csv('C:\\doc.csv', parse_dates=['date'], index_col='date',date_parser=dateparse)

Error is shown as:

ValueError: time data 'Jul 07, 2019' does not match format '%m/%d/%Y'
Willem Van Onsem
  • [Here's](http://strftime.org/) a good source to find the right codes for your parsers in the future – Erfan Jul 27 '19 at 12:19

1 Answer

In short: the format is %b %d, %Y

You need to change the format you specified in the dateparse:

from datetime import datetime  # pd.datetime is deprecated since pandas 1.0
import pandas as pd
dateparse = lambda dates: datetime.strptime(dates, '%b %d, %Y')
data = pd.read_csv('C:\\doc.csv', parse_dates=['date'], index_col='date', date_parser=dateparse)

For example:

>>> datetime.strptime('Jul 07, 2018', '%b %d, %Y')
datetime.datetime(2018, 7, 7, 0, 0)
>>> datetime.strptime('Apr 07, 2018', '%b %d, %Y')
datetime.datetime(2018, 4, 7, 0, 0)
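
Since the question asks for dd-mm-yyyy output, here is a minimal sketch of the round trip: parse with '%b %d, %Y', then render with '%d-%m-%Y'. It assumes an English locale, because %b matches locale-dependent month abbreviations; the variable names are only illustrative.

```python
from datetime import datetime

# Parse the example string from the question into a datetime object.
parsed = datetime.strptime('Jul 07, 2018', '%b %d, %Y')

# Render it back as a dd-mm-yyyy string.
formatted = parsed.strftime('%d-%m-%Y')
print(formatted)  # 07-07-2018
```

Note that the parsed value is a datetime object; the dd-mm-yyyy form only exists once you format it back into a string.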
Willem Van Onsem
  • This has two issues: first, a named lambda, and second, pandas comes with [`to_datetime`](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.to_datetime.html), which would not require the same implicit Python loops and also accepts a format string – roganjosh Jul 27 '19 at 13:22
  • Also, [`read_csv`](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_csv.html) itself allows you to specify dtypes for different columns and a date format string (though I tend to do it in two steps, as in my initial comment). I guess the `date_parser` argument is for formats not covered by the regular datetime syntax – roganjosh Jul 27 '19 at 13:26
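
The two-step approach suggested in the comments could look roughly like the sketch below: read the CSV first, then convert the column with `pd.to_datetime` and an explicit format. The in-memory CSV and its `price` column are made up for illustration and stand in for the file from the question.

```python
import io

import pandas as pd

# Hypothetical stand-in for the CSV in the question; the 'price' column is invented.
# The dates contain a comma, so they must be quoted in the CSV.
csv_text = 'date,price\n"Jul 07, 2018",10.5\n"Apr 07, 2018",11.0\n'

# Step 1: read the file without any date parsing.
data = pd.read_csv(io.StringIO(csv_text))

# Step 2: convert the whole column at once with an explicit format string.
data['date'] = pd.to_datetime(data['date'], format='%b %d, %Y')
data = data.set_index('date')

# The index now holds timestamps; format them as dd-mm-yyyy strings only for display.
display_dates = list(data.index.strftime('%d-%m-%Y'))
print(display_dates)
```

This avoids calling a Python function once per row, which is the point raised in the first comment.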