Split up datetime interval according to labeled partition of a week

Question

I have a `shift`, which is a datetime interval (a pair of datetimes). My weeks have a labeled partition (every week is the same: it is divided into parts, and each part has a label). I want to split `shift` into labeled parts (i.e. into several subintervals) according to the partition of the week.

Example. Suppose `shift` is the interval 2019-10-21 18:30 - 2019-10-22 08:00, and the partition of the week is as follows: Monday to Friday 07:00 - 19:00 has label A, and the rest of the week has label B. In this case, the splitting of `shift` should be the following list of labeled subintervals:

2019-10-21 18:30 - 2019-10-21 19:00 with label A,
2019-10-21 19:00 - 2019-10-22 07:00 with label B, and
2019-10-22 07:00 - 2019-10-22 08:00 with label A.

How do I do this in general?

Input: a datetime interval (pair), and a labeled partition of the week (how to best represent this?)

Output: a list of labeled datetime intervals (pairs).

Note that `shift` can start in one week and end in another (e.g. Sunday evening to Monday morning); every week has the same labeled partition.

Will Da Silva · Answer 1 · 2021-10-10T18:25:49.560

Here's a way to obtain the desired intervals:

from collections import namedtuple
from datetime import datetime, timedelta
import itertools as it


# Built-in as `it.pairwise` in Python 3.10+
def pairwise(iterable):
    # Named `iterator` so that it does not shadow the `it` module alias
    iterator = iter(iterable)
    a = next(iterator, None)
    for b in iterator:
        yield (a, b)
        a = b


def beginning_of_week(d: datetime) -> datetime:
    ''' Returns the datetime object for the beginning of the week the provided day is in. '''
    return (d - timedelta(days=d.weekday())).replace(hour=0, minute=0, second=0, microsecond=0)


Partition = namedtuple('Partition', ('start', 'stop', 'label')) # output format


def _partition_shift_within_week(start: int, stop: int, partitions):
    ''' Splits the shift (defined by `start` and `stop`) into partitions within one week. '''
    # Get partitions as ranges of absolute offsets from the beginning of the week in seconds
    labels = [x for _, x in partitions]
    absolute_offsets = it.accumulate(int(x.total_seconds()) for x, _ in partitions)
    ranges = [range(x, y) for x, y in pairwise((0, *absolute_offsets))]
    first_part_idx = [start in x for x in ranges].index(True)
    last_part_idx = [stop in x for x in ranges].index(True)
    # Slice ranges and labels with the same indices so that they stay aligned
    part_slice = slice(first_part_idx, last_part_idx + 1)
    for r, label in zip(ranges[part_slice], labels[part_slice]):
        yield Partition(
            timedelta(seconds=max(r.start, start)), # start of subinterval
            timedelta(seconds=min(r.stop, stop)),   # end of the subinterval
            label
        )


def _partition_shift_unjoined(shift, partitions):
    ''' Partitions a shift across weeks with partitions unjoined at the week edges. '''
    start_monday = beginning_of_week(shift[0])
    stop_monday = beginning_of_week(shift[1])
    seconds_offsets = (
        int((shift[0] - start_monday).total_seconds()),
        *[604800] * ((stop_monday - start_monday).days // 7),
        int((shift[1] - stop_monday).total_seconds()),
    )
    # Each consecutive pair of offsets covers the part of the shift inside one week
    for week_idx, (x, y) in enumerate(pairwise(seconds_offsets)):
        x %= 604800  # every week after the first starts at offset 0
        weeks_offset = timedelta(weeks=week_idx)
        # `y - (y == 604800)` keeps the stop offset inside the last range of the week
        for part in _partition_shift_within_week(x, y - (y == 604800), partitions):
            yield Partition(
                start_monday + weeks_offset + part.start,
                start_monday + weeks_offset + part.stop,
                part.label
            )


def partition_shift(shift, partitions):
    ''' Partitions a shift across weeks. '''
    results = []
    for part in _partition_shift_unjoined(shift, partitions):
        if len(results) and results[-1].label == part.label:
            results[-1] = Partition(results[-1].start, part.stop, part.label)
        else:
            results.append(part)
    return results

Usage example:

shift = (datetime(2019, 10, 21, 18, 30), datetime(2019, 10, 22, 8, 0))

# Partitions are stored as successive offsets from the beginning of the week
partitions = (
    (timedelta(hours=7), 'B'), # Monday morning (midnight to 07:00)
    (timedelta(hours=12), 'A'),
    (timedelta(hours=12), 'B'), # Monday night & Tuesday morning (til 07:00)
    (timedelta(hours=12), 'A'),
    (timedelta(hours=12), 'B'), # Tuesday night & Wednesday morning (til 07:00)
    (timedelta(hours=12), 'A'),
    (timedelta(hours=12), 'B'), # Wednesday night & Thursday morning (til 07:00)
    (timedelta(hours=12), 'A'),
    (timedelta(hours=12), 'B'), # Thursday night & Friday morning (til 07:00)
    (timedelta(hours=12), 'A'),
    (timedelta(hours=53), 'B'), # Friday night & the weekend
)

for start, end, label in partition_shift(shift, partitions):
    print(f"'{start}' - '{end}', label: {label}")

Output:

'2019-10-21 18:30:00' - '2019-10-21 19:00:00', label: A
'2019-10-21 19:00:00' - '2019-10-22 07:00:00', label: B
'2019-10-22 07:00:00' - '2019-10-22 08:00:00', label: A

This approach assumes that the partitions are given as successive offsets from the beginning of the week. The question did not specify how the partitions would be provided, so I chose this format. It's nice because it guarantees that the partitions do not overlap, and it uses time deltas instead of being tied to a particular date.

Converting other ways of specifying partitions into this one, or adapting this answer to work with them, is left as an exercise for the reader.
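As one possible starting point for that conversion, here is a minimal sketch that turns rules of the form (weekday, start time, end time, label) into the successive-offset format. The helper name `rules_to_offsets` and the rule format are assumptions made for illustration, not part of the answer above. The sketch assumes the rules do not overlap and do not cross midnight.

```python
from datetime import time, timedelta

WEEK = timedelta(days=7)


def rules_to_offsets(rules, default_label):
    """Convert (weekday, start_time, end_time, label) rules into successive offsets.

    Hypothetical helper: assumes rules do not overlap and do not cross midnight.
    Weekdays follow datetime.weekday(), so Monday is 0.
    """
    # Turn each rule into (start, stop, label) offsets from Monday 00:00
    spans = sorted(
        (
            timedelta(days=day, hours=start.hour, minutes=start.minute),
            timedelta(days=day, hours=end.hour, minutes=end.minute),
            label,
        )
        for day, start, end, label in rules
    )
    pieces = []  # (duration, label); gaps are filled with the default label
    cursor = timedelta(0)
    for start, stop, label in spans:
        if start > cursor:
            pieces.append((start - cursor, default_label))
        pieces.append((stop - start, label))
        cursor = stop
    if cursor < WEEK:
        pieces.append((WEEK - cursor, default_label))
    # Merge neighbouring pieces that carry the same label
    merged = []
    for duration, label in pieces:
        if merged and merged[-1][1] == label:
            merged[-1] = (merged[-1][0] + duration, label)
        else:
            merged.append((duration, label))
    return tuple(merged)


rules = [(day, time(7), time(19), 'A') for day in range(5)]  # Monday to Friday
partitions = rules_to_offsets(rules, 'B')
```

With these rules, the result should be the same eleven entries as the hand-written `partitions` tuple in the usage example.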

Here's another usage example, using the same partitions as before, but a shift that starts in the previous week, thereby demonstrating that this approach works even when the shift spans multiple weeks.

shift = (datetime(2019, 10, 19, 18, 30), datetime(2019, 10, 22, 8, 0))

for start, end, label in partition_shift(shift, partitions):
    print(f"'{start}' - '{end}', label: {label}")

Output:

'2019-10-19 18:30:00' - '2019-10-21 07:00:00', label: B
'2019-10-21 07:00:00' - '2019-10-21 19:00:00', label: A
'2019-10-21 19:00:00' - '2019-10-22 07:00:00', label: B
'2019-10-22 07:00:00' - '2019-10-22 08:00:00', label: A

dirck · Answer 2 · 2021-10-09T19:51:09.210

I'd approach this by building a generic data structure and then mapping week-minutes on top of it.

The generic structure looks like this:

class OrderedRangeMap:
    """ ranges must be contiguous ; 0..limit """
    def __init__(self, limit, default_value=""):
        self.ranges = [(0,default_value),(limit,None)]

    def find(self, key):
        """ returns (block, index, exact_match) for the block containing key """
        # linear scan; could do bsearch since the keys are sorted
        if key < self.ranges[0][0]:
            # key lies before the first block
            return None,0,False
        kvp = None
        for i,kv in enumerate(self.ranges):
            k = kv[0]
            if key < k:
                # key falls inside the previous block
                return kvp,i-1,False
            if key == k:
                # key is exactly the start of this block
                return kv,i,True
            kvp = kv
        # off the end
        return None, len(self.ranges)-1, False

    def add(self, skey, ekey, value):
        newblock = (skey,value)
        oldblock,si,sx = self.find(skey)
        endblock,ei,ex = self.find(ekey)
        if sx:  #if start match, replace the oldblock
            self.ranges[si] = newblock
        else:   #else insert after the oldblock
            # bump
            si += 1
            ei += 1
            self.ranges.insert(si,newblock)
        if si == ei:
            # insert the split block after that
            self.ranges.insert(si+1,(ekey,oldblock[1]))
        else:
            # different blocks
            # end block starts at new end point
            self.ranges[ei] = (ekey,endblock[1])
            # delete any in between
            del self.ranges[si+1:ei]
        # is that it?

    def __getitem__(self, key):
        block,index,match = self.find(key)
        if index >= len(self.ranges) - 1:
            return block[0], block[0], block[1]
        return block[0], self.ranges[index+1][0], block[1]


def test_orm():
    orm = OrderedRangeMap(100, "B")
    assert orm.ranges == [(0,"B"),(100,None)]
    # s/e in same block
    orm.add(10,20, "A")
    assert orm.ranges == [(0,"B"),(10,"A"),(20,"B"),(100,None)]
    # s/e in same blocks, matches
    orm.add(10,13, "a")
    assert orm.ranges == [(0,"B"),(10,"a"),(13, "A"),(20,"B"),(100,None)]
    # more blocks
    orm.add(30,50, "c")
    assert orm.ranges == [(0,"B"),(10,"a"),(13, "A"),(20,"B"),(30,"c"),(50,"B"),(100,None)]
    # s/e in different blocks, no matches
    orm.add(15,33, "d")
    assert orm.ranges == [(0,"B"),(10,"a"),(13, "A"),(15,"d"),(33,"c"),(50,"B"),(100,None)]
    # s/e in different blocks, s matches
    orm.add(15,44, "e")
    assert orm.ranges == [(0,"B"),(10,"a"),(13, "A"),(15,"e"),(44,"c"),(50,"B"),(100,None)]
    # s/e in different blocks, s & e matches
    orm.add(13,50, "f")
    assert orm.ranges == [(0,"B"),(10,"a"),(13, "f"),(50,"B"),(100,None)]
    # NOT tested: add outside of original range

test_orm()
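The comment in `find` mentions that a binary search could be used. A minimal sketch of that idea with the standard `bisect` module follows. `find_block` is a hypothetical standalone function rather than part of the class above, and it rebuilds the list of start keys on every call, which a real implementation would cache.

```python
from bisect import bisect_right


def find_block(ranges, key):
    """Binary-search variant of OrderedRangeMap.find (a sketch, not the original).

    `ranges` is the sorted list of (start_key, value) pairs used above,
    ending with the (limit, None) sentinel.
    Returns (block, index, exact_match) like the linear version.
    """
    if key < ranges[0][0]:
        return None, 0, False
    if key > ranges[-1][0]:
        return None, len(ranges) - 1, False
    starts = [k for k, _ in ranges]      # O(n) here; cache this in real use
    i = bisect_right(starts, key) - 1    # last block whose start is <= key
    return ranges[i], i, starts[i] == key


ranges = [(0, "B"), (10, "A"), (20, "B"), (100, None)]
print(find_block(ranges, 15))  # block starting at 10, no exact match
```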

(Edited to add:)

The upper layer converts from datetime to week minutes:

import datetime
class WeekShiftLabels:
    # this is assuming Monday=0

    week_minutes = 7*24*60

    def __init__(self, default_label="?"):
        self.orm = OrderedRangeMap(self.week_minutes, default_label)

    def add(self, dow, starttime, endtime, label):
        dm = dow * 24*60
        st = dm + t2m(starttime)
        et = dm + t2m(endtime)
        self.orm.add(st, et, label)

    def __getitem__(self, dt):
        wm = dt2wm(dt)
        # find and ranges live on the underlying OrderedRangeMap
        block,index,match = self.orm.find(wm)
        if index >= len(self.orm.ranges) - 1:
            return None
        return block[1]

    class WSLI:
        # This doesn't handle modulo week_minutes
        def __init__(self, wsl, sdt, edt):
            self.wsl = wsl
            self.base = sdt - datetime.timedelta(days=sdt.weekday())
            t = sdt.time()
            self.base -= datetime.timedelta(hours=t.hour,minutes=t.minute)
            self.i = dt2wm(sdt)
            self.em = dt2wm(edt)
        def __next__(self):
            if self.i < 0:
                raise StopIteration
            block,index,match = self.wsl.orm.find(self.i)
            if not block:
                raise StopIteration # or something else
            start = wm2dt(self.base, self.i)
            end = self.wsl.orm.ranges[index+1][0]
            if end >= self.em:
                end = self.em
                self.i = -1
            else:
                self.i = end
            end = wm2dt(self.base, end)
            return start, end, block[1]

        def __iter__(self):
            return self

    def __call__(self, sdt, edt):
        return self.WSLI(self, sdt, edt)

def dt2wm(dt):
    t = dt.time()
    return dt.weekday() * 24*60 + t.hour*60 + t.minute

def wm2dt(base,wm):
    return base + datetime.timedelta(minutes=wm)

def t2m(t):
    return t.hour*60 + t.minute

def test_wsl():
    wsl = WeekShiftLabels("B")
    st = datetime.time(hour=7)
    et = datetime.time(hour=19)
    for dow in range(0,5):  # Monday (0) to Friday (4)
        wsl.add(dow, st, et, "A")
    r = list(wsl(datetime.datetime(2019, 10, 21, 18, 30), datetime.datetime(2019, 10, 22, 8, 0)))
    assert len(r) == 3
    assert r[0]==(datetime.datetime(2019, 10, 21, 18, 30), datetime.datetime(2019, 10, 21, 19, 0), 'A')
    assert r[1]==(datetime.datetime(2019, 10, 21, 19, 0), datetime.datetime(2019, 10, 22, 7, 0), 'B')
    assert r[2]==(datetime.datetime(2019, 10, 22, 7, 0), datetime.datetime(2019, 10, 22, 8, 0), 'A')

test_wsl()
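The `WSLI` iterator notes that it does not handle shifts that cross a week boundary. One possible first step, sketched here under the assumption of naive datetimes and Monday-based weeks, is to cut the shift at every Monday 00:00 so that each piece lies within a single week. `split_at_week_boundaries` is a hypothetical helper, not part of the classes above.

```python
from datetime import datetime, timedelta


def split_at_week_boundaries(sdt, edt):
    """Yield (start, end) pieces of [sdt, edt) that each lie within one Monday-based week."""
    while sdt < edt:
        # Monday 00:00 of the week containing the current start
        monday = (sdt - timedelta(days=sdt.weekday())).replace(
            hour=0, minute=0, second=0, microsecond=0)
        next_monday = monday + timedelta(days=7)
        piece_end = min(edt, next_monday)
        yield sdt, piece_end
        sdt = piece_end


# Sunday evening to Monday morning is cut at Monday 00:00
for piece in split_at_week_boundaries(datetime(2019, 10, 20, 22, 0),
                                      datetime(2019, 10, 21, 6, 0)):
    print(piece)
```

A piece that ends exactly at Monday 00:00 maps to week minute 0 under `dt2wm`, so the iterator would still need to treat that end as `week_minutes` before these pieces could be fed to it directly.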

WTRipper · Answer 3 · 2019-10-29T06:44:13.733

You have not defined what happens if one of your shift limits (start or end) lies inside one part of your week and the other limit lies outside. For example, what happens if you have

2019-10-21 18:30 - 2019-10-21 19:00

Is it A or B? You could make a rule that the label is B if any part lies in "B", or test only the start or the end, or take the average, and so on. So I will show how to check whether one specific datetime lies inside the intervals. I don't know of a library that automates this task more than the datetime library does.

datetime library

import datetime

now = datetime.datetime.now()
hour = now.hour
# day of the week as int, where Mon 0 and Sun 6
day = now.weekday()

if day < 5 and hour >= 7 and hour < 19:
    label = "A"
else:
    label = "B"

print(label)

You could also check whether the hour or day lies in a range or list. For example, if you want to account for a break between 12 and 13 o'clock:

if hour in range(7, 12) or hour in range(13, 19):
    pass  # do something

Docs for more info: https://docs.python.org/3.8/library/datetime.html

People also often recommend the Pendulum library, but skimming its docs I can't see any method that makes your task easier than the code above. Of course, you could do something like this (though it doesn't seem any easier to me; the code is not tested):

alternate solution using pendulum

import pendulum

now = pendulum.now()
daystart = now.start_of('day')
weekstart = now.start_of('week')

if now < weekstart.add(days=5) and daystart.add(hours=7) <= now < daystart.add(hours=19):
    label = "A"
else:
    label = "B"

pendulum docs: https://pendulum.eustace.io/docs

Note that either approach can be written, possibly with some adjustments, with either library (pendulum or datetime), and probably with many others I haven't mentioned.

Bonus

Since you asked for a way to handle such things more generally, here is one last thing: how the first solution could be made a bit more generic.

import datetime

gethour = lambda dt : dt.hour
getday = lambda dt : dt.weekday()

timeframes = {
    "A": {
        getday: range(0, 5),  # Monday to Friday
        # range objects cannot be concatenated with + in Python 3
        gethour: list(range(7, 12)) + list(range(13, 19))
    },
    "break": {
        getday: range(0, 5),
        gethour: [12]
    }
}
default = "B"

now = datetime.datetime.now()

for tag, timeframe in timeframes.items():
    label = tag
    for getter, limit in timeframe.items():
        if not getter(now) in limit:
            label = default
            break
    if label != default:
        break

print(label)
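The per-datetime check above can be extended to the interval-splitting task that the question asks about. The following is a minimal sketch, assuming naive datetimes and that labels can only change at 07:00 and 19:00. `label_at` and `split_shift` are hypothetical names introduced for illustration. The idea is to cut the shift at every candidate boundary, label each piece by its start, and merge neighbours with the same label.

```python
from datetime import datetime, time, timedelta


def label_at(dt):
    """Label of a single datetime: 'A' on Monday-Friday 07:00-19:00, else 'B'."""
    return "A" if dt.weekday() < 5 and 7 <= dt.hour < 19 else "B"


def split_shift(start, end, boundary_times=(time(7), time(19))):
    """Split [start, end) into maximal subintervals with a single label."""
    # Candidate cut points: every boundary time on every day the shift touches
    cuts = {start, end}
    day = start.date()
    while day <= end.date():
        for t in boundary_times:
            cut = datetime.combine(day, t)
            if start < cut < end:
                cuts.add(cut)
        day += timedelta(days=1)
    points = sorted(cuts)
    result = []
    for a, b in zip(points, points[1:]):
        label = label_at(a)  # the label is constant between two cut points
        if result and result[-1][2] == label:
            result[-1] = (result[-1][0], b, label)  # merge with the previous piece
        else:
            result.append((a, b, label))
    return result


for piece in split_shift(datetime(2019, 10, 21, 18, 30), datetime(2019, 10, 22, 8, 0)):
    print(piece)
```

For the shift in the question, this should produce the three subintervals labeled A, B and A that the question lists.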

I am not asking how to decide the label of a single `datetime`. Rather, I want to split up an interval of `datetime`s into (maximal) subintervals such that each subinterval has a single label. — Ricardo Buring, Oct 29 '19 at 13:17
@Ricardo Buring you are right I misunderstood your question. I'll update my answer tomorrow. — WTRipper, Oct 30 '19 at 00:12
