Somewhat related to this post: dateutil parser for month/year format: return beginning of month

Given a date string of the form 'Sep-2020', dateutil.parser.parse correctly identifies the month and the year but fills in a day as well. If a default is provided, it takes the day from it; otherwise it uses today's day. Is there any way to tell whether the parser used any of the default values?

For example, how can I tell from the three cases below that the input date string in the first case did not include a day and that the default value was used?

>>> from datetime import datetime
>>> from dateutil import parser
>>> d = datetime(1978, 1, 1, 0, 0)
>>> parser.parse('Sep-2020', default=d)
datetime.datetime(2020, 9, 1, 0, 0)
>>> parser.parse('1-Sep-2020', default=d)
datetime.datetime(2020, 9, 1, 0, 0)
>>> parser.parse('Sep-1-2020', default=d)
datetime.datetime(2020, 9, 1, 0, 0)
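To make the default behaviour described above concrete, here is a minimal sketch that parses the same string against two defaults that differ in year, month, and day. Any field that changes between the two results must have come from the default. The names `DEFAULT_A`, `DEFAULT_B`, and `defaulted_fields` are made up for illustration and are not part of dateutil; this only looks at the date fields, and it costs two parses per string.

```python
from datetime import datetime

from dateutil import parser

# Hypothetical defaults chosen so that year, month, and day all differ.
DEFAULT_A = datetime(1978, 1, 1)
DEFAULT_B = datetime(1979, 2, 2)


def defaulted_fields(text):
    """Return the date fields that were filled in from the default."""
    a = parser.parse(text, default=DEFAULT_A)
    b = parser.parse(text, default=DEFAULT_B)
    # A field that was present in the string is identical in both results.
    return [
        name for name in ("year", "month", "day")
        if getattr(a, name) != getattr(b, name)
    ]


print(defaulted_fields("Sep-2020"))    # ['day']
print(defaulted_fields("1-Sep-2020"))  # []
```

This uses only the public `parse(..., default=...)` call, so it should not depend on dateutil internals.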
vkkodali

2 Answers

I did something a little mad to solve this. It's mad since it's not guaranteed to work with future versions of dateutil (since it's relying on some dateutil internals).

Currently I'm using: python-dateutil 2.8.1.

I wrote my own class and passed it as default to the parser:

from datetime import datetime


class SentinelDateTime:

    def __init__(self, year=0, month=0, day=0, default=None):
        self._year = year
        self._month = month
        self._day = day

        if default is None:
            default = datetime.now().replace(
                hour=0, minute=0,
                second=0, microsecond=0
            )

        self.year = default.year
        self.month = default.month
        self.day = default.day
        self.default = default

    @property
    def has_year(self):
        return self._year != 0

    @property
    def has_month(self):
        return self._month != 0

    @property
    def has_day(self):
        return self._day != 0

    def todatetime(self):
        res = {
            attr: value
            for attr, value in [
                ("year", self._year),
                ("month", self._month),
                ("day", self._day),
            ] if value
        }
        return self.default.replace(**res)

    def replace(self, **result):
        return SentinelDateTime(**result, default=self.default)

    def __repr__(self):
        return "%s(%d, %d, %d)" % (
            self.__class__.__qualname__,
            self._year,
            self._month,
            self._day
        )

The dateutil parser now returns an instance of this SentinelDateTime class:


>>> from dateutil import parser
>>> from datetime import datetime
>>> from snippet1 import SentinelDateTime
>>>
>>> sentinel = SentinelDateTime()
>>> s = parser.parse('Sep-2020', default=sentinel)
>>> s
SentinelDateTime(2020, 9, 0)
>>> s.has_day
False
>>> s.todatetime()
datetime.datetime(2020, 9, 9, 0, 0)


>>> d = datetime(1978, 1, 1)
>>> sentinel = SentinelDateTime(default=d)
>>> s = parser.parse('Sep-2020', default=sentinel)
>>> s
SentinelDateTime(2020, 9, 0)
>>> s.has_day
False
>>> s.todatetime()
datetime.datetime(2020, 9, 1, 0, 0)
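For anyone wondering which internals this relies on: as far as I can tell from python-dateutil 2.8.1, the parser builds its result by calling `replace()` on the default with only the fields it found in the string. The sketch below is an illustration under that assumption, not documented behaviour. `RecordingDefault` is a hypothetical stand-in that just returns whatever keyword arguments it receives.

```python
from dateutil import parser


class RecordingDefault:
    """Hypothetical stand-in for a datetime default (illustration only)."""

    # The parser reads these when the string has no day (assumed from 2.8.1).
    year, month, day = 1978, 1, 1

    def replace(self, **fields):
        # Hand back exactly what the parser asked to replace.
        return fields


# Relies on private behaviour: parse() returning the result of
# default.replace(...) unchanged when the string has no timezone.
parsed_fields = parser.parse("Sep-2020", default=RecordingDefault())
print(parsed_fields)
```

On 2.8.1 I would expect this to show only the year and month keys, which is the information `SentinelDateTime.replace` captures. Other versions may behave differently.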

I wrote this answer into a little package: https://github.com/foxyblue/sentinel-datetime

foxyblue
  • This fails when the date string has year only. For example: `print(parser.parse('2020'))` returns `2020-01-10 00:00:00` without a default but fails when I use `default=sentinel`. – vkkodali Jan 10 '21 at 13:57
  • Interesting, I've recreated this bug. I'll see if I can find a fix – foxyblue Jan 10 '21 at 14:00
  • Thank you @foxyblue, but now I am getting today's date for whichever date I try to parse. For example, `print(parser.parse('2020-09', default=sentinel))` and `print(parser.parse('2020', default=sentinel))` both return `SentinelDateTime(2021, 1, 10)` – vkkodali Jan 10 '21 at 14:19
  • hmm I'm not getting that, would you like to start an issue here so we can talk through this outside of these poorly formatted comments: https://github.com/foxyblue/sentinel-datetime/issues – foxyblue Jan 10 '21 at 14:23
  • `SentinelDateTime(2020, 9, 0)` and `SentinelDateTime(2020, 0, 0)`: this is my result – foxyblue Jan 10 '21 at 14:25
  • Thanks! I created an issue on github. – vkkodali Jan 10 '21 at 15:12

I have found a solution that's a little less complicated:

from datetime import datetime
from dataclasses import dataclass

from dateutil import parser


@dataclass
class Result:
    dt: datetime
    data: object  # dateutil's internal parse result; fields not found in the string are None


class subparser(parser.parser):

    def _build_naive(self, res, default):
        naive = super()._build_naive(res, default)
        return Result(dt=naive, data=res)

For example:

>>> PARSER = subparser()
>>> info = PARSER.parse("2020")
>>> info.data.year
2020
>>> print(info.data.month)
None
>>> print(info.dt)
2020-01-10 00:00:00
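Building on the subclass above, a small helper can report which date fields were absent from the input. This is a rough sketch: `missing_fields` is a hypothetical name, and it leans on the same private `_build_naive` hook, so it may break in other dateutil versions. I have only considered strings without timezone information; with a timezone the parser would likely try to call `replace()` on the `Result` object and fail.

```python
from dataclasses import dataclass
from datetime import datetime

from dateutil import parser


@dataclass
class Result:
    dt: datetime
    data: object  # dateutil's internal parse result


class subparser(parser.parser):

    def _build_naive(self, res, default):
        # Private hook: keep the raw parse result next to the datetime.
        naive = super()._build_naive(res, default)
        return Result(dt=naive, data=res)


def missing_fields(text):
    """Return the date fields that the input string did not contain."""
    info = subparser().parse(text)
    return [
        name for name in ("year", "month", "day")
        if getattr(info.data, name) is None
    ]


print(missing_fields("Sep-2020"))    # ['day']
print(missing_fields("1-Sep-2020"))  # []
```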
foxyblue