
Say you are given a list of integer pairs, pairs, and two integers k1 and k2.
Find the number of pairs of pairs from the list that satisfy:

  • pairs[i][0] + pairs[j][0] <= k1
  • pairs[i][1] + pairs[j][1] <= k2
  • for i < j

For example, if pairs=[[1,2],[2,3],[3,4],[4,5]], k1=6 and k2=7, the result should be 4, since ([1,2],[2,3]), ([1,2],[3,4]), ([1,2],[4,5]) and ([2,3],[3,4]) all satisfy the conditions stated above.

See the picture for a better description of the question:

[image]

Is there a way to solve this more efficiently than O(n^2)? This is my solution so far:

pairs = [[1,2],[2,3],[3,4],[4,5]]
k1 = 6
k2 = 7

count = 0
n = len(pairs)

for i in range(n):
    for j in range(i+1, n):
        if pairs[i][0]+pairs[j][0] <= k1 and pairs[i][1]+pairs[j][1] <= k2:
            count += 1

print(count)
Rodrigo Rodrigues
seekerpig

2 Answers


This is O(n log n) and solves the largest allowed inputs (n=2×10^5) in about 1.5 seconds:

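from collections import deque
from sortedcontainers import SortedList  # third-party package

pairs = deque(sorted(pairs))              # sorted by x; we pop from both ends
ys = SortedList(y for _, y in pairs)      # y-values of all remaining pairs
count = 0
while pairs:
    x, y = pairs[0]                       # leftmost pair (smallest x)
    X, Y = pairs[-1]                      # rightmost pair (largest x)
    if x + X <= k1:
        ys.remove(y)                      # don't pair the leftmost with itself
        count += ys.bisect_right(k2 - y)  # others with y_other <= k2 - y
        pairs.popleft()
    else:
        ys.remove(Y)                      # rightmost pair fits with nothing
        pairs.pop()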

Consider the pairs as (x,y) pairs. Sort them by x-coordinate, then move inwards. Let (x,y) be the leftmost pair and (X,Y) be the rightmost pair.

  • If x+X ≤ k1, then, as far as the x-coordinate is concerned, the leftmost pair can be combined with every other remaining pair (since X is the largest x). But how many of them also have a fitting y-coordinate, i.e., y + y_other ≤ k2? That means y_other ≤ k2-y. For this, we keep the y-coordinates of all remaining pairs in a sorted list. Since we want other pairs, we first remove the leftmost pair's own y. Then we binary-search for k2-y, which tells us the number of fitting other pairs. Finally, we remove the leftmost pair from further consideration, since we have counted all its contributions.

  • If x+X > k1, then the rightmost pair's X is too large to be combined with any other remaining pair (since x is the smallest x). So we just remove its Y from the y-list and remove the pair.
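
To make the two cases concrete, here is a small sketch that runs the same sweep on the question's example and records what each step does. It uses a plain list with bisect instead of SortedList so that it needs only the standard library; the name trace_sweep and the shape of the recorded steps are just illustrative choices.

```python
from bisect import bisect_left, bisect_right
from collections import deque


def trace_sweep(pairs, k1, k2):
    """Run the inward sweep and record each step (plain-list variant)."""
    pairs = deque(sorted(pairs))
    ys = sorted(y for _, y in pairs)
    count = 0
    steps = []
    while pairs:
        x, y = pairs[0]
        X, Y = pairs[-1]
        if x + X <= k1:
            del ys[bisect_left(ys, y)]         # exclude the leftmost pair itself
            gained = bisect_right(ys, k2 - y)  # partners with y_other <= k2 - y
            count += gained
            pairs.popleft()
            steps.append(("left", (x, y), gained))
        else:
            del ys[bisect_left(ys, Y)]         # rightmost pair fits with nothing
            pairs.pop()
            steps.append(("right", (X, Y), 0))
    return count, steps


count, steps = trace_sweep([[1, 2], [2, 3], [3, 4], [4, 5]], 6, 7)
for side, pair, gained in steps:
    print(side, pair, gained)
print(count)
```

On the example, the pairs (1,2) and (2,3) are removed from the left and contribute 3 and 1, then (4,5) is dropped from the right, and the last remaining pair (3,4) contributes nothing.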

I used SortedList (from the sortedcontainers package) there. If we use a plain Python list instead, the algorithm takes O(n^2) time, because del takes linear time, but with a very small constant factor, so it still takes only about 3 seconds for the largest allowed inputs.
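
As a rough illustration of where the linear cost in the plain-list variant comes from, the removal step looks like this. The helper name remove_one is made up for this sketch. Finding the position is a binary search, but the deletion itself has to shift every later element.

```python
from bisect import bisect_left


def remove_one(sorted_ys, y):
    """Delete a single occurrence of y from a sorted list (y must be present)."""
    i = bisect_left(sorted_ys, y)  # O(log n): index of the leftmost occurrence
    del sorted_ys[i]               # O(n): later elements shift left by one


ys = [2, 3, 3, 3, 5]
remove_one(ys, 3)
print(ys)  # [2, 3, 3, 5]
```

Duplicates are not a problem: only one copy of the value is removed, and it doesn't matter which one.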

Test results with your small example and larger random inputs:

4 pairs:
  count=4  0.000 s  original
  count=4  0.000 s  Kelly_SortedList
  count=4  0.000 s  Kelly_list

1000 pairs:
  count=51328  0.144 s  original
  count=51328  0.003 s  Kelly_SortedList
  count=51328  0.002 s  Kelly_list

6000 pairs:
  count=1845645  4.786 s  original
  count=1845645  0.023 s  Kelly_SortedList
  count=1845645  0.012 s  Kelly_list

200000 pairs:
  count=2022417695  1.490 s  Kelly_SortedList
  count=2022417695  2.944 s  Kelly_list

400000 pairs:
  count=8075422313  3.454 s  Kelly_SortedList
  count=8075422313 12.957 s  Kelly_list

Code for that:

from bisect import bisect_left, bisect_right
from collections import deque
from random import randrange
from time import perf_counter as time
from sortedcontainers import SortedList


def original(pairs, k1, k2):
    count = 0
    n = len(pairs)
    for i in range(n):
        for j in range(i+1, n):
            if pairs[i][0]+pairs[j][0] <= k1 and pairs[i][1]+pairs[j][1] <= k2:
                count += 1
    return count


def Kelly_SortedList(pairs, k1, k2):
    pairs = deque(sorted(pairs))
    ys = SortedList(y for _, y in pairs)
    count = 0
    while pairs:
        x, y = pairs[0]
        X, Y = pairs[-1]
        if x + X <= k1:
            ys.remove(y)
            count += ys.bisect_right(k2 - y)
            pairs.popleft()
        else:
            ys.remove(Y)
            pairs.pop()
    return count


def Kelly_list(pairs, k1, k2):
    pairs = deque(sorted(pairs))
    ys = sorted(y for _, y in pairs)
    count = 0
    while pairs:
        x, y = pairs[0]
        X, Y = pairs[-1]
        if x + X <= k1:
            del ys[bisect_left(ys, y)]
            count += bisect_right(ys, k2 - y)
            pairs.popleft()
        else:
            del ys[bisect_left(ys, Y)]
            pairs.pop()
    return count


funcs = original, Kelly_SortedList, Kelly_list

def test(funcs, *args):
    print(len(args[0]), 'pairs:')
    for f in funcs:
        t0 = time()
        print(f'  count={f(*args)}', f'{time() - t0 :6.3f} s ', f.__name__)
    print()

def gen(n):
    pairs = [[randrange(2*10**5), randrange(2*10**5)] for _ in range(n)]
    k1 = 15 * 10**4
    k2 = 17 * 10**4
    return pairs, k1, k2

test(funcs, [[1,2],[2,3],[3,4],[4,5]], 6, 7)
test(funcs, *gen(1000))
test(funcs, *gen(6000))
test(funcs[1:], *gen(2*10**5))
test(funcs[1:], *gen(4*10**5))
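
The benchmark only prints the counts side by side. A possible way to check them automatically is to compare against the brute force on many small random inputs, with small values to force duplicates. This is a sketch using the plain-list variant so it runs with the standard library only; brute_force, sweep_list and cross_check are names chosen for this example.

```python
from bisect import bisect_left, bisect_right
from collections import deque
from random import Random


def brute_force(pairs, k1, k2):
    n = len(pairs)
    return sum(
        pairs[i][0] + pairs[j][0] <= k1 and pairs[i][1] + pairs[j][1] <= k2
        for i in range(n) for j in range(i + 1, n)
    )


def sweep_list(pairs, k1, k2):
    pairs = deque(sorted(pairs))
    ys = sorted(y for _, y in pairs)
    count = 0
    while pairs:
        x, y = pairs[0]
        X, Y = pairs[-1]
        if x + X <= k1:
            del ys[bisect_left(ys, y)]
            count += bisect_right(ys, k2 - y)
            pairs.popleft()
        else:
            del ys[bisect_left(ys, Y)]
            pairs.pop()
    return count


def cross_check(trials=200, seed=0):
    """Return the first input where the two functions disagree, else None."""
    rng = Random(seed)
    for _ in range(trials):
        n = rng.randrange(30)
        pairs = [[rng.randrange(10), rng.randrange(10)] for _ in range(n)]
        k1, k2 = rng.randrange(20), rng.randrange(20)
        if sweep_list(pairs, k1, k2) != brute_force(pairs, k1, k2):
            return pairs, k1, k2
    return None


print(cross_check())
```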
Kelly Bundy

Here's an approach with O(n) space and O(n log n) time. Given an order statistic tree ys (initially empty), the pairs sorted by their first element, and two pointers, r at the last pair's index and l at 0:

result = 0
while r > l:
  while l < r and pairs[l][0] + pairs[r][0] <= k1:
    add pairs[l][1] to ys
    l += 1
  result += count of ys <= (k2 - pairs[r][1])
  r -= 1
while r > 0:
  if r < l:
    remove one pairs[r][1] from ys
  result += count of ys <= (k2 - pairs[r][1])
  r -= 1
return result

This adds the number of pairs that can be matched for each pair at the right index. As the window narrows, all the pairs on the left (with smaller first elements) remain candidates, and more can be added as the fixed pair on the right has a smaller and smaller first element. The order statistic tree helps us answer for the pair on the right how many of the candidates on the left also abide by the second element restriction.

(Please note that the pseudocode is untested and may not include handling for all cases, where the pointers are in various states between the code blocks presented.)
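
For reference, one possible runnable version of this idea in Python is sketched below. It is an interpretation, not the pseudocode verbatim: the two loops are merged into a single loop over r, the inner loop is bounded by l < r so that only candidates left of r are ever added, and a Fenwick tree over the compressed second elements stands in for the order statistic tree. The function name and helper names are invented for this sketch.

```python
from bisect import bisect_right


def count_pairs_two_pointer(pairs, k1, k2):
    pairs = sorted(pairs)
    n = len(pairs)

    # Coordinate-compress the second elements to ranks 1..m.
    distinct_ys = sorted({y for _, y in pairs})
    rank = {y: i + 1 for i, y in enumerate(distinct_ys)}
    tree = [0] * (len(distinct_ys) + 1)  # Fenwick tree of candidate counts

    def update(i, delta):
        while i < len(tree):
            tree[i] += delta
            i += i & -i

    def prefix(i):
        total = 0
        while i > 0:
            total += tree[i]
            i -= i & -i
        return total

    result = 0
    l = 0
    for r in range(n - 1, 0, -1):
        x_r, y_r = pairs[r]
        # Admit candidates on the left whose first element fits with x_r.
        while l < r and pairs[l][0] + x_r <= k1:
            update(rank[pairs[l][1]], 1)
            l += 1
        # Once the pointers have crossed, index r itself is among the
        # candidates and must be taken out before counting.
        if l > r:
            update(rank[y_r], -1)
        # Count candidates whose second element is <= k2 - y_r.
        result += prefix(bisect_right(distinct_ys, k2 - y_r))
    return result


print(count_pairs_two_pointer([[1, 2], [2, 3], [3, 4], [4, 5]], 6, 7))  # 4
```

The Fenwick tree works here because all second elements are known up front; a true order statistic tree would not need the compression step.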

גלעד ברקן