I am using dateutil.parser.parse to parse date strings which might contain partial information. If some information is not present, parse can take a default keyword argument from which it will fill any missing fields. If default is not given, today's date (at midnight) is used.

For a case like dateutil.parser.parse("Thursday"), this means it will return the date of the next Thursday. However, I need it to return the date of the last Thursday (including today, if today happens to be a Thursday).

So, assuming today == datetime.datetime(2018, 2, 20) (a Tuesday), I would like to get all of these asserts to be true:

from dateutil import parser
from datetime import datetime

def parse(date_str, default=None):
    # this needs to be modified
    return parser.parse(date_str, default=default)

today = datetime(2018, 2, 20)

assert parse("Tuesday", default=today) == today    # True
assert parse("Thursday", default=today) == datetime(2018, 2, 15)    # False
assert parse("Jan 31", default=today) == datetime(2018, 1, 31)    # True
assert parse("December 10", default=today) == datetime(2017, 12, 10)    # False

Is there an easy way to achieve this? With the current parse function only the first and third assert would pass.

Graipher
  • It doesn't default to last or next, it just replaces the components of the default with the ones it finds in the string – Paul Feb 20 '18 at 11:54
  • @Paul You are correct, will re-word. – Graipher Feb 20 '18 at 11:55
  • Check whether that weekday (or date) has already passed; if it has not, subtract a week (or a year). – Page David Feb 20 '18 at 11:56
  • @Paul Hm, `dateutil.parser.parse("Thursday", default=datetime.datetime(2018, 12, 31)) == datetime.datetime(2019, 1, 3)`, though. – Graipher Feb 20 '18 at 11:57
  • @Graipher Thursday with 2018/2/20 was parsed to 2018/2/22, and with 2018/12/31 it was parsed to 2019/1/3. Both results are the first Thursday after the given day. Is there anything that looks strange? – Page David Feb 20 '18 at 12:08
  • So you want to get last occurrence of all the date ? – Vikas Periyadath Feb 20 '18 at 12:23
  • @DavidPage Well, that is just the behavior of `dateutil.parser.parse`, but it is not the one I need. I would need them to parse to 2018/02/15 and 2018/12/27. – Graipher Feb 20 '18 at 12:40
  • @VikasDamodar I want `"Thursday"` to parse to the date of the last Thursday (with respect to some reference), `"Dec 10"` to be the last day with that date (regardless if it was this or last year) and so on. – Graipher Feb 20 '18 at 13:06
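
To make the behaviour discussed in these comments concrete, here is a rough standard-library sketch of what the parser appears to do with the default. It is not dateutil's actual code; the helper names fill_month_day and weekday_on_or_after are made up for illustration.

```python
from datetime import datetime, timedelta


def fill_month_day(default, month, day):
    # Field replacement: month and day come from the string,
    # everything else (year, time) is taken from `default`.
    return default.replace(month=month, day=day)


def weekday_on_or_after(default, weekday):
    # Weekday case: move forward to the first matching weekday
    # on or after `default` (Monday == 0 ... Sunday == 6).
    return default + timedelta(days=(weekday - default.weekday()) % 7)


today = datetime(2018, 2, 20)  # a Tuesday

print(fill_month_day(today, 12, 10))   # 2018-12-10 00:00:00
print(weekday_on_or_after(today, 3))   # 2018-02-22 00:00:00
print(weekday_on_or_after(today, 1))   # 2018-02-20 00:00:00
```

Under this reading, "December 10" stays in the default's year and "Thursday" moves forward, which is why the second and fourth asserts in the question fail.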

1 Answer

Here's your modified code (code.py):

#!/usr/bin/env python3

import sys
from dateutil import parser
from datetime import datetime, timedelta


today = datetime(2018, 2, 20)

data = [
    ("Tuesday", today, today),
    ("Thursday", datetime(2018, 2, 15), today),
    ("Jan 31", datetime(2018, 1, 31), today),
    ("December 10", datetime(2017, 12, 10), today),
]


def parse(date_str, default=None):
    # this needs to be modified
    return parser.parse(date_str, default=default)


def _days_in_year(year):
    try:
        datetime(year, 2, 29)
    except ValueError:
        return 365
    return 366


def parse2(date_str, default=None):
    dt = parser.parse(date_str, default=default)
    if default is not None:
        weekday_strs = [day_str.lower() for day_tuple in parser.parserinfo.WEEKDAYS for day_str in day_tuple]
        if date_str.lower() in weekday_strs:
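            # parser.parse yields the first matching weekday on or after default,
            # so any result later than default must be moved back one week
            if dt > default: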
                dt -= timedelta(days=7)
        else:
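            # Compare against the default argument rather than the module-level today
            if (dt.month > default.month) or ((dt.month == default.month) and (dt.day > default.day)):
                # The one-year span contains Feb 29 of dt.year only if dt falls after February
                dt -= timedelta(days=_days_in_year(dt.year if dt.month > 2 else dt.year - 1))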
    return dt


def print_stats(parse_func):
    print("\nPrinting stats for \"{:s}\"".format(parse_func.__name__))
    for triple in data:
        d = parse_func(triple[0], default=triple[2])
        print("  [{:s}] [{:s}] [{:s}] [{:s}]".format(triple[0], str(d), str(triple[1]), "True" if d == triple[1] else "False"))


if __name__ == "__main__":
    print("Python {:s} on {:s}\n".format(sys.version, sys.platform))
    print_stats(parse)
    print_stats(parse2)

Notes:

  • I changed the structure of the code "a bit", to parametrize it, so if a change is needed (e.g. a new example to be added) the changes should be minimal
    • Instead of asserts, I added a function (print_stats) that prints the results (instead of raising AssertionError and exiting the program if things don't match)
      • Takes an argument (parse_func) which is a function that does the parsing (e.g. parse)
      • Uses some globally declared data (data) together with the (above) function
    • data - is a list of triples, where each triple contains:
      1. Text to be converted
      2. Expected datetime ([Python 3.Docs]: datetime Objects) to be yielded by the conversion
      3. default argument to be passed to the parsing function (parse_func)
  • parse2 function (an improved version of parse):

    • Accepts 2 types of date strings:
      1. Weekday name
      2. Month / Day (unordered)
    • Does the regular parsing, and if the converted object comes after the one passed as the default argument (that is determined by comparing the appropriate attributes of the 2 objects), it subtracts a period (take a look at [Python 3.Docs]: timedelta Objects):
      1. "Thursday" comes after "Tuesday", so it subtracts the number of days in a week (7)
      2. "December 10" comes after "February 20", so it subtracts the number of days in the year*
    • weekday_strs: I'd better explain it by example:

      >>> parser.parserinfo.WEEKDAYS
      [('Mon', 'Monday'), ('Tue', 'Tuesday'), ('Wed', 'Wednesday'), ('Thu', 'Thursday'), ('Fri', 'Friday'), ('Sat', 'Saturday'), ('Sun', 'Sunday')]
      >>> [day_str.lower() for day_tuple in parser.parserinfo.WEEKDAYS for day_str in day_tuple]
      ['mon', 'monday', 'tue', 'tuesday', 'wed', 'wednesday', 'thu', 'thursday', 'fri', 'friday', 'sat', 'saturday', 'sun', 'sunday']
      
      • Flattens parser.parserinfo.WEEKDAYS
      • Converts strings to lowercase (for simplifying comparisons)
  • _days_in_year* - as you probably guessed, returns the number of days in a year (couldn't simply subtract 365 because leap years might mess things up):
    >>> dt = datetime(2018, 3, 1)
    >>> dt
    datetime.datetime(2018, 3, 1, 0, 0)
    >>> dt - timedelta(365)
    datetime.datetime(2017, 3, 1, 0, 0)
    >>> dt = datetime(2016, 3, 1)
    >>> dt
    datetime.datetime(2016, 3, 1, 0, 0)
    >>> dt - timedelta(365)
    datetime.datetime(2015, 3, 2, 0, 0)
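
As an alternative way to express the two rules from the notes, the sketch below computes the most recent weekday and the most recent month/day directly from a reference datetime, without counting days in a year. It uses only the standard library and takes the weekday number or month/day as already extracted from the string. The function names most_recent_weekday and most_recent_month_day are hypothetical and not part of dateutil.

```python
from datetime import datetime, timedelta


def most_recent_weekday(reference, weekday):
    # Step back to the latest `weekday` (Monday == 0 ... Sunday == 6);
    # the offset is 0 when `reference` already falls on that weekday.
    return reference - timedelta(days=(reference.weekday() - weekday) % 7)


def most_recent_month_day(reference, month, day):
    # Walk back from the reference year; 9 candidate years are enough to
    # reach a Feb 29 even across a non-leap century year such as 1900.
    for year in range(reference.year, reference.year - 9, -1):
        try:
            candidate = reference.replace(year=year, month=month, day=day)
        except ValueError:
            continue  # the date does not exist in this year (e.g. Feb 29)
        if candidate <= reference:
            return candidate
    raise ValueError("no valid date for month={}, day={}".format(month, day))


today = datetime(2018, 2, 20)  # a Tuesday

print(most_recent_weekday(today, 3))         # 2018-02-15 00:00:00
print(most_recent_month_day(today, 12, 10))  # 2017-12-10 00:00:00
```

Replacing the year instead of subtracting a day count sidesteps the leap-year question, at the cost of having to decide what "Feb 29" means in a non-leap year; here it falls back to the latest year in which that date exists.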
    

Output:

(py35x64_test) E:\Work\Dev\StackOverflow\q048884480>"e:\Work\Dev\VEnvs\py35x64_test\Scripts\python.exe" code.py
Python 3.5.4 (v3.5.4:3f56838, Aug  8 2017, 02:17:05) [MSC v.1900 64 bit (AMD64)] on win32


Printing stats for "parse"
  [Tuesday] [2018-02-20 00:00:00] [2018-02-20 00:00:00] [True]
  [Thursday] [2018-02-22 00:00:00] [2018-02-15 00:00:00] [False]
  [Jan 31] [2018-01-31 00:00:00] [2018-01-31 00:00:00] [True]
  [December 10] [2018-12-10 00:00:00] [2017-12-10 00:00:00] [False]

Printing stats for "parse2"
  [Tuesday] [2018-02-20 00:00:00] [2018-02-20 00:00:00] [True]
  [Thursday] [2018-02-15 00:00:00] [2018-02-15 00:00:00] [True]
  [Jan 31] [2018-01-31 00:00:00] [2018-01-31 00:00:00] [True]
  [December 10] [2017-12-10 00:00:00] [2017-12-10 00:00:00] [True]
CristiFati