8

I have the following string input: 24052017. When I try to do:

>>> dateutil.parser.parse("24052017")

It tells me that month must be in 1..12.

I have even tried doing:

>>> dateutil.parser.parse("24052017", dayfirst=True)

It gives me exactly the same outcome.

What seems to be happening is that it does not like the fact that there are no spaces or separators. It reads the day correctly, but when it comes to the month it reads 0520. This is what I suspect, at least.

How do I convert this specific input using dateutil.parser, without manipulating the string?

Renier

5 Answers

10

This format is currently not supported by dateutil. In general, if you know the format of your date and it does not have time zones, you should just use datetime.datetime.strptime to parse your dates, as dateutil.parser.parse has a considerable amount of overhead that it uses trying to figure out what format your date is in, and, critically, it may get that format wrong.

There is a pull request against the 2.6.0 branch, currently under debate, to add this format; you can find it here, on dateutil's GitHub. The main argument against it is that, if you are trying to parse a series of dates, it will interpret 12052017 as "December 5, 2017" but 13052017 as "May 13, 2017". (That said, you have the same inconsistency now, in that the first date will parse to December 5, 2017, but the second date will simply fail.)
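
To illustrate that objection, here is a minimal stdlib-only sketch. It is not dateutil's actual algorithm: the helper name guess_compact_date and the order of the candidate formats are assumptions chosen purely for illustration. It tries MMDDYYYY first and falls back to DDMMYYYY, which is enough to reproduce the inconsistent readings described above.

```python
from datetime import datetime

# Hypothetical guesser (not dateutil's implementation): try MMDDYYYY,
# then fall back to DDMMYYYY when the first interpretation is invalid.
CANDIDATE_FORMATS = ("%m%d%Y", "%d%m%Y")

def guess_compact_date(dtstr):
    for fmt in CANDIDATE_FORMATS:
        try:
            return datetime.strptime(dtstr, fmt)
        except ValueError:
            continue  # this interpretation failed, try the next one
    raise ValueError("no candidate format matched: %r" % dtstr)

# Valid as MMDDYYYY, so it is read as month=12, day=5.
print(guess_compact_date("12052017"))
# Month 13 does not exist, so it falls through to day=13, month=5.
print(guess_compact_date("13052017"))
```

Two neighbouring strings in the same column end up with the day and month swapped, which is why silently guessing the format is risky for a series of dates.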

If you do not know the format of the string, but you know that if it is an 8-digit numerical date you want it to be interpreted as DDMMYYYY, for now your best bet is to hard-code that exception into your parser:

from dateutil.parser import parse as duparse
from datetime import datetime

def parse(dtstr, *args, **kwargs):
    if len(dtstr) == 8 and dtstr.isnumeric():
        return datetime.strptime(dtstr, '%d%m%Y')
    else:
        return duparse(dtstr, *args, **kwargs)

There is some slow-moving planned effort to provide a more flexible and extensible parser for dateutil, but not much work has been done on this yet.

Paul
8

If you're not precious about using dateutil, you could do this with datetime.datetime.strptime:

from datetime import datetime

print(datetime.strptime("24052017", '%d%m%Y'))

This prints (in yyyy-mm-dd hh:mm:ss format):

2017-05-24 00:00:00
asongtoruin
  • Thanks for your answer. I do know that I can do it that way, however, I would like to know if there is a way to do it using `dateutil.parser` :) – Renier Jun 02 '17 at 13:42
1

Well, dateutil.parser.parse needs some hints about the date format you're trying to parse. In the absence of such hints it assumes YYYYMMDD format, so your input becomes equivalent to 2405-20-17. Either rearrange your string to read 20170524, for example with dateutil.parser.parse(d[4:8]+d[2:4]+d[0:2]), or use separators: dateutil.parser.parse("24.05.2017") will work (however, the former method is preferred, due to the ambiguity of the latter).
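
The rearrangement can be sketched as follows. The helper name ddmmyyyy_to_yyyymmdd is hypothetical, and datetime.strptime stands in for the final parsing step so that the snippet runs without third-party packages; with dateutil installed, the reordered string would be passed to dateutil.parser.parse instead.

```python
from datetime import datetime

def ddmmyyyy_to_yyyymmdd(d):
    """Reorder an 8-digit DDMMYYYY string into YYYYMMDD (hypothetical helper)."""
    if len(d) != 8 or not d.isdigit():
        raise ValueError("expected 8 digits, got %r" % d)
    return d[4:8] + d[2:4] + d[0:2]

reordered = ddmmyyyy_to_yyyymmdd("24052017")
print(reordered)  # 20170524

# Stand-in for dateutil.parser.parse(reordered), used here only to keep
# the example free of third-party imports.
print(datetime.strptime(reordered, "%Y%m%d"))
```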

Błotosmętek
1

You should be using the datetime library, as mentioned in asongtoruin's answer. But if you want to achieve this using dateutil.parser, you first have to convert your string to a format that dateutil understands. Below is an example:

>>> import dateutil.parser
>>> d_string = "24052017"

#                                                    to consider day before month v
>>> dateutil.parser.parse('/'.join([d_string[:2], d_string[2:4],d_string[4:]]), dayfirst=True)
datetime.datetime(2017, 5, 24, 0, 0)

Here I am converting "24052017" to "24/05/2017" before it is passed to the dateutil.parser.parse(...).

Moinuddin Quadri
0

You cannot use dateutil.parser.parse without manipulating the string.

import dateutil.parser

parserinfo = dateutil.parser.parserinfo(dayfirst=True, yearfirst=False)
print(dateutil.parser.parse("24052017", parserinfo))

> Traceback (most recent call last):
> File "python", line 4, in <module>
> ValueError: month must be in 1..12

http://dateutil.readthedocs.io/en/stable/parser.html#dateutil.parser.parserinfo

Inside parserinfo, JUMP is a list of separators.

# m from a.m/p.m, t from ISO T separator
JUMP = [" ", ".", ",", ";", "-", "/", "'",
        "at", "on", "and", "ad", "m", "t", "of",
        "st", "nd", "rd", "th"]

The empty string is not part of it.

M07
  • Let me remind you that the question was "How to I convert this specific input using dateutil.parser, without manipulating the string?" Everybody is manipulating the string. – M07 Jun 02 '17 at 14:32
  • Your answer is not especially helpful, but more problematic is that your reasoning is wrong. If the values are not separated they count as one token, if the token is 8 digits the parser attempts to determine if it is `YYYYMMDD` or `MMDDYYYY`, but does not check for `DDMMYYYY`. My answer and asongtoruin's answer both give alternative approaches that do not manipulate the string. – Paul Jun 02 '17 at 23:12
  • Your alternative is to use datetime.strptime in some cases... So why not just use datetime.strptime, as asongtoruin suggested, instead of a more complex solution? Nobody has provided a one-line answer using the dateutil.parser method because there is no solution. My response is the only correct one; an alternative that fixes the problem was already provided by asongtoruin. – M07 Jun 03 '17 at 15:28
  • My explanation clearly explains *why* this is not possible in dateutil. My main point is that your explanation is both incorrect (the reasoning is wrong) and unhelpful (does not provide an alternative way to achieve this result), which is likely why it is not well-received. – Paul Jun 04 '17 at 13:34
  • FWIW, the reason to use `strptime` selectively is that `dateutil.parser.parse` is frequently used on a loop of dates in mixed formats. If you know that some dates are `DDMMYYYY` but want to use `dateutil` to infer the format of other dates, my code is designed to provide fallback to `dateutil` when the date is in an unexpected format. – Paul Jun 04 '17 at 13:39