
Imagine that I have two periods, which would be dates, but are integers here for simplicity:

ID  START  END  VALUE
 A      3    5      2
 B      1    7      1

How do I get the intersection of the two periods, along with the parts of them that fall outside the intersection, returning something like this:

{ 'inner': [3, 5],
  'outer': [[1, 2], [6, 7]] }

The first idea I had was to walk through every possible date and mark whether it falls inside or outside the intersection. That, however, will take an astonishingly long time.

The second approach I thought of was to generate a list of every single date inside both ranges, check which dates match, and then somehow reduce the result back into period ranges... but that too seems exceedingly inefficient.
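For concreteness, that second approach could be sketched roughly as follows. The function name `brute_force_split` and the dict layout are only illustrative, and it returns the inner part as a list of ranges rather than a single range. It does produce the right pieces, but the work grows with the number of days covered rather than with the number of periods:

```python
from itertools import groupby


def brute_force_split(range1, range2):
    """Enumerate every point, then collapse consecutive runs back into ranges."""
    points1 = set(range(range1[0], range1[1] + 1))
    points2 = set(range(range2[0], range2[1] + 1))

    def to_ranges(points):
        ordered = sorted(points)
        ranges = []
        # Consecutive integers share the same (value - position) key,
        # so each group is one unbroken run.
        for _, run in groupby(enumerate(ordered), key=lambda pair: pair[1] - pair[0]):
            values = [value for _, value in run]
            ranges.append([values[0], values[-1]])
        return ranges

    return {'inner': to_ranges(points1 & points2),   # points in both ranges
            'outer': to_ranges(points1 ^ points2)}   # points in exactly one range


print(brute_force_split([3, 5], [1, 7]))
# {'inner': [[3, 5]], 'outer': [[1, 2], [6, 7]]}
```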

Is there some way to do this inside the standard libraries? Or the pandas libraries?

ifly6
  • Related: https://stackoverflow.com/questions/46525786/how-to-join-two-dataframes-for-which-column-values-are-within-a-certain-range – cs95 Apr 11 '19 at 20:52
  • Perhaps this may be some kind of X/Y, but I'm trying to generate columns based on structural data which overlaps other data, and to take the overlapping data as valid for the period in which it overlaps. Merging a timestamp into an interval doesn't do that. – ifly6 Apr 11 '19 at 21:32

1 Answer


I haven't found any similarly named questions with working solutions. To avoid an xkcd "What did you see?" sort of situation, I'll post what I came up with.

I ended up developing an algorithm by hand. I have zero confidence that it is the best way to do this.

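def range_intersection(set1, set2):
    # Both arguments are inclusive [start, end] pairs; neither is modified.
    # The overlap runs from the later start to the earlier end.
    inner = [max(set1[0], set2[0]), min(set1[1], set2[1])]
    if inner[1] < inner[0]:
        # no overlap at all: both ranges lie entirely outside
        return [None, [list(set1), list(set2)]]

    # whatever is left over on either side of the overlap
    before = [min(set1[0], set2[0]), inner[0] - 1]
    after = [inner[1] + 1, max(set1[1], set2[1])]

    new_sets = []
    if not after[1] < after[0]:
        new_sets.append(after)
    if not before[1] < before[0]:
        new_sets.append(before)

    return [inner, new_sets]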

Whatever the case, it does give the correct answer for the test case (and for some others I tried as well):

>>> range_intersection([1, 7], [3, 5])
[[3, 5], [[6, 7], [1, 2]]]
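
Since the real data are dates rather than integers, the same splitting can be done with `datetime.date` and a one-day `timedelta`. This is only a rough sketch, assuming both periods are inclusive and day-granular; `date_range_intersection`, `ONE_DAY` and the sample dates are made up for illustration:

```python
from datetime import date, timedelta

ONE_DAY = timedelta(days=1)  # assumption: periods are whole, inclusive days


def date_range_intersection(period1, period2):
    """Hypothetical date analogue of the integer version.

    Each period is an inclusive (start, end) pair of datetime.date objects.
    Returns {'inner': (start, end) or None, 'outer': [(start, end), ...]}.
    """
    start = max(period1[0], period2[0])
    end = min(period1[1], period2[1])
    if end < start:
        # The periods do not overlap, so both are entirely "outer".
        return {'inner': None, 'outer': [tuple(period1), tuple(period2)]}

    outer = []
    first_start = min(period1[0], period2[0])
    last_end = max(period1[1], period2[1])
    if first_start < start:
        # Leftover piece before the overlap begins.
        outer.append((first_start, start - ONE_DAY))
    if end < last_end:
        # Leftover piece after the overlap ends.
        outer.append((end + ONE_DAY, last_end))
    return {'inner': (start, end), 'outer': outer}


result = date_range_intersection(
    (date(2019, 4, 3), date(2019, 4, 5)),
    (date(2019, 4, 1), date(2019, 4, 7)),
)
print(result['inner'])
print(result['outer'])
```

Here the outer pieces come back in chronological order. It only compares the four endpoints, so the cost does not depend on how long the periods are.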
ifly6