
Under both Python 2.7 and 3.3.5, I am trying to parse a string into a date.

>>> from dateutil import parser
>>> parser.parse("On")
datetime.datetime(2014, 12, 25, 0, 0)

Today is 2014-12-25, so parsing "On" (in any letter case) appears to return today's date. Is this behavior correct?

I would actually like this call to raise an exception, since I don't think "On" is a valid date. How can I get that behavior? I don't want to special-case the input "On", because I can't tell which other strings like it might surprise me later.

In some special cases, the parser returns today's date without raising an exception even when fuzzy=False is set explicitly. For example:

>>> parser.parse("' '", fuzzy=False)
datetime.datetime(2014, 12, 25, 0, 0)

Based on the feedback, a possible workaround is to pass a rarely used default date and compare the result against it to tell whether parsing succeeded:

>>> from datetime import datetime
>>> parser.parse("' '", fuzzy=False, default=datetime(1900, 1, 1))
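The sentinel-default workaround can be wrapped in a small helper. This is only a minimal sketch: the names `parse_or_raise` and `SENTINEL` are made up for illustration and are not part of dateutil. It assumes that 1900-01-01 00:00 never occurs as a real value in the input. Some later dateutil releases may already raise for input that contains no date; in that case the helper simply lets that error propagate.

```python
from datetime import datetime

from dateutil import parser

# Hypothetical sentinel: assumed never to appear as a genuine input value.
SENTINEL = datetime(1900, 1, 1)


def parse_or_raise(text):
    """Parse text with dateutil, rejecting input that yields only the default."""
    result = parser.parse(text, default=SENTINEL)
    # If nothing in the string contributed a date or time field,
    # the parser hands back the default unchanged.
    if result == SENTINEL:
        raise ValueError("no date found in %r" % (text,))
    return result
```

The trade-off is that a string that really means 1900-01-01 00:00 is rejected too, and partial input such as a bare month name is still accepted with the missing fields taken from the sentinel.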
bobyuan
  • I don't think this is a duplicate. The question never mentions setting `fuzzy` to `True`. Since the default is `False`, if `"On"` were invalid, an exception would be thrown. This can only mean that `dateutil.parser` thinks `"On"` is a valid date string. Why that's the case will probably require looking into the source code. – huu Dec 25 '14 at 06:53
  • @lpapp let's reopen, but the topic is very close to that one anyway. The solution would be (referring to the [potential duplicate](http://stackoverflow.com/questions/12960614/trouble-in-parsing-date-using-dateutil)) to check if the returned value is the current date and, if so, treat it as a parsing error. Not ideal, but setting the default to anything other than a date would also result in an error, and `None` would not work, since `dateutil` would then use the current date as the default. Thanks. – alecxe Dec 25 '14 at 08:35

1 Answer


"on" is in dateutil.parser.parserinfo.JUMP. When dateutil's parser processes a time string, it checks whether each component of the string is in the JUMP list and skips those that are. The only component of "On" is "on", which is in the JUMP list, so nothing is left to parse and the default datetime is returned.

The component is lowercased before the check, so the match is case-insensitive.
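The JUMP check can be inspected directly. The snippet below is a small sketch that relies on the `parserinfo.JUMP` class attribute and the `parserinfo.jump()` method; the variable name `info` is just illustrative. It also suggests why the "' '" input from the question behaves the same way, since both the quote and the space appear in the JUMP list.

```python
from dateutil import parser

info = parser.parserinfo()

# "on" is one of the filler tokens the parser skips over.
print("on" in parser.parserinfo.JUMP)   # True

# jump() lowercases its argument, so the check ignores case.
print(info.jump("On"))                  # True

# The quote and the space are JUMP tokens as well.
print(info.jump("'"), info.jump(" "))   # True True

# A weekday name is not a JUMP token; it is handled elsewhere.
print(info.jump("Monday"))              # False
```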

maralla