Find the positions that match a condition on parent and child lists

Question

Given a parent list with start and end times as numbers say (p1, p2):

1,5
2,2
4,10

Also another child list with their start and end times as (c1, c2):

2, 4
15,20

Find all the index positions from the parent and child list such that the below condition is satisfied:

p1 <= c1 <= c2 <= p2

For this example, the expected result is (0,0).

Explanation:

The valid combination is :

1 <= 2 <= 4 <= 5 that is position 0 from the parent list (1,5) matches with the condition for position 0 (2,4) of the child list.

So position 0 from the parent list and position 0 from the child list that is (0,0)

Constraints:

size of the parent and child list can be from 1 to 10^5
each element of this list can be from 1 to 10^9

Code that I tried:

static List<List<Integer>> process(List<List<Integer>> parent, List<List<Integer>> child) {
    List<List<Integer>> answer = new ArrayList<>();
    for(int i=0; i<parent.size(); i++) {
        List<Integer> p = parent.get(i);
        int p1 = p.get(0);
        int p2 = p.get(1);
        for(int j=0; j<child.size(); j++) {
            List<Integer> c = child.get(j);
            int c1 = c.get(0);
            int c2 = c.get(1);
            if((p1 <= c1) && (c1 <= c2) && (c2 <= p2)) {
                answer.add(Arrays.asList(i, j));
            }
        }
    }
    return answer;
}

This code works for small inputs but fails for larger list sizes with time-out errors. What is the best approach to solve this problem?

May be worth a look if you can work with a 3rd party lib :[guava-rangeset](https://www.baeldung.com/guava-rangeset) — Eritrean, Dec 17 '22 at 22:20
Is this from some online training or competition site? Will you share a link, please? — Ole V.V., Dec 18 '22 at 02:45
Thanks. Do we agree that `process(Arrays.asList(Arrays.asList(2,7)), Arrays.asList(Arrays.asList(1,6), Arrays.asList(3,8), Arrays.asList(4,5)))` ought to give [[0, 2]]? I get []. — Ole V.V., Dec 18 '22 at 03:52
hmm ... not sure if I understand the problem completely, but: basically you have a bunch of closed intervals, want to decide if one is inside of another and store the indices of both container and containee. My first step would be to declare a class encapsulating an interval with a boolean method that decides the insideness. Then have nested loops across parent/child and store the loop indices if contained. That's working - correct answer also for @OleV.V. case - but don't know how it performs for really big lists. — kleopatra, Dec 20 '22 at 13:55
right now, you have two problems: a) your code is not correct (producing unexpected results for certain data) b) performance - so I would suggest to first make the code work correctly. Then profile to find the bottleneck/s and fix those (with tests in place to guarantee correctness). — kleopatra, Dec 20 '22 at 14:12
@kleopatra, I added the nested loops approach now and removed the other approach. Now I have only time-out errors for large inputs. — learner, Dec 20 '22 at 14:37
curious: what's the context of the question? what are you supposed to learn from it? BTW: your current code returns (childIndex, parentIndex) while the description seems to require (parentIndex, childIndex) — kleopatra, Dec 20 '22 at 15:29
@kleopatra, it's a typo error, I fixed it. I am looking for a program that takes less time to run. — learner, Dec 20 '22 at 16:20
yeah, your question is explicit on what you want ;) But wondering why you insist on not explaining your context - you didn't answer @OleV.V. nor me .. - that context might have a hint for you on how to solve this. — kleopatra, Dec 20 '22 at 16:48
@kleopatra, @ Ole V.V. this was asked during an interview some months back, I am trying to understand how this can be solved in less time. does this answer the context now? — learner, Dec 20 '22 at 19:42
In your problem description you said you want to get *the* index but your code tries to find *all* indices. — Holger, Dec 21 '22 at 13:56
@Holger, basically there can be multiple positions where the condition satisfies. So I need to collect all the positions. — learner, Dec 21 '22 at 14:41

NiceGuySaysHi · Answer 1 · 2022-12-29T11:34:36.173

Let's consider each interval as an event. One idea is to sort the parent and child lists and then scan them from left to right. While doing this, we keep track of the "active events" from the parent list. For example, if the parent list has events e1 = (1, 5), e2 = (8, 11) and the child list has events e1' = (2, 6), e2' = (9, 10), a scan would look like this: start event e1 -> start event e1' -> end event e1 -> end event e1' -> start event e2 -> start event e2' -> end event e2' -> end event e2.

While scanning, we keep track of the active events from the parent list by adding them to a binary search tree, sorted by starting point (and removing them again when they end). When we end an event ek' from the child list, we search for the starting point of ek' in the binary search tree and that way find all active intervals that have a smaller or equal key. We can pair all of these up with the child interval and add them to the solution.

The total time complexity is still O(n^2) in the worst case, since it is possible that every child interval is in every parent interval. However, the complexity should be close to O(n log n) if there is only a small number of these pairs.

I got part of the idea from the following link, so looking at it might help you understand what I am doing: Sub O(n^2) algorithm for counting nested intervals?
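A minimal sketch of this scan, under some assumptions of mine: the class name TreeSweep and the method findPairs are made up for illustration, a java.util.TreeMap stands in for the binary search tree, and intervals with start > end are skipped because they can never take part in a match. It is one possible reading of the idea, not necessarily what the answer's author had in mind.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.TreeMap;
import java.util.TreeSet;

final class TreeSweep {
    // Event types; at equal positions events are processed in this order.
    private static final int PARENT_START = 0, CHILD_END = 1, PARENT_END = 2;

    static List<List<Integer>> findPairs(List<List<Integer>> parent, List<List<Integer>> child) {
        List<int[]> events = new ArrayList<>(); // each event is {position, type, index}
        for (int i = 0; i < parent.size(); i++) {
            int start = parent.get(i).get(0), end = parent.get(i).get(1);
            if (start > end) continue; // assumption: such a parent can never contain a child
            events.add(new int[] {start, PARENT_START, i});
            events.add(new int[] {end, PARENT_END, i});
        }
        for (int j = 0; j < child.size(); j++) {
            int start = child.get(j).get(0), end = child.get(j).get(1);
            if (start > end) continue; // violates c1 <= c2
            events.add(new int[] {end, CHILD_END, j}); // only the end of a child triggers a lookup
        }
        events.sort(Comparator.<int[]>comparingInt(e -> e[0]).thenComparingInt(e -> e[1]));

        // Active parents keyed by their start; several parents may share one start.
        TreeMap<Integer, TreeSet<Integer>> active = new TreeMap<>();
        List<List<Integer>> answer = new ArrayList<>();
        for (int[] e : events) {
            int index = e[2];
            if (e[1] == PARENT_START) {
                active.computeIfAbsent(e[0], k -> new TreeSet<>()).add(index);
            } else if (e[1] == PARENT_END) {
                int start = parent.get(index).get(0);
                TreeSet<Integer> sameStart = active.get(start);
                sameStart.remove(index);
                if (sameStart.isEmpty()) active.remove(start);
            } else {
                // Every active parent ends at or after this child's end, so it
                // contains the child exactly when it started at or before the child's start.
                int childStart = child.get(index).get(0);
                for (TreeSet<Integer> parents : active.headMap(childStart, true).values()) {
                    for (int p : parents) answer.add(Arrays.asList(p, index));
                }
            }
        }
        return answer;
    }

    public static void main(String[] args) {
        List<List<Integer>> parent = Arrays.asList(
            Arrays.asList(1, 5), Arrays.asList(2, 2), Arrays.asList(4, 10));
        List<List<Integer>> child = Arrays.asList(Arrays.asList(2, 4), Arrays.asList(15, 20));
        System.out.println(findPairs(parent, child)); // prints [[0, 0]]
    }
}
```

The pairs come out in the order in which the children end, not sorted by parent index.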

score 1 · Answer 2 · answered Dec 27 '22 at 09:10

Consider an alternative algorithm

The posted code is slow for large inputs, because it checks all combinations of parents and children, even for inputs where the number of answers will be a relatively small set. I put an emphasis on the last point, to highlight that when all children are within all parents, then the answer must contain all pairings.

A more efficient solution is possible for inputs where the number of answers is significantly smaller than all possible pairings. (And without degrading the performance in case the answer is the complete set.)

Loop over the interesting positions from left to right. An interesting position is where a parent or child interval starts or ends.
If the position is a parent:
- If this is the start of the parent, add the parent to a linked hashset of started parents.
- Otherwise it's the end of the parent. Remove this parent from the linked hashset.
If the position is the end of a child:
- Loop over the linked hashset of started parents. Every parent still in the hashset has not ended yet, so it ends at or after the end of the child.
  - If the parent was started at or before the start of the child, add the index pair to the answers.
  - Otherwise break out of the loop, the remaining started parents were started after the child.

The key element that makes this fast is the following properties of a linked hashset:

Adding an item is O(1)
Removing an item is O(1)
The insertion order of items is preserved

The last point is especially important, combined with the idea that we are looping over positions from left to right, so we have the ordering that we need to eliminate parent-child pairs that won't be part of the answer.
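To illustrate the insertion-order property, here is a tiny sketch; the class name LinkedHashSetDemo and the sample indexes are made up for this example. After a removal, iteration still visits the remaining items in the order in which they were added.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;

final class LinkedHashSetDemo {
    static List<Integer> remainingInOrder() {
        LinkedHashSet<Integer> startedParents = new LinkedHashSet<>();
        startedParents.add(7);    // parent 7 starts first
        startedParents.add(3);    // then parent 3
        startedParents.add(5);    // then parent 5
        startedParents.remove(3); // parent 3 ends
        // Iteration follows insertion order, not numeric order.
        return new ArrayList<>(startedParents); // [7, 5]
    }
}
```

Because parents are inserted while scanning from left to right, this iteration order is also the order of their start positions.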

The step of looping over interesting positions above is a bit tricky. Here's one way to do it:

Define a new class to use for sorting, let's call it Tracker. It must have:
- Position of an interesting index: the start or end of a parent or child
- A flag to indicate if this position is a start or an end
- A flag to indicate if this is a parent or a child
- The original index in the parent or child list
Build a list of Tracker instances from the parent and child lists
- For each parent, add two instances, one for the start and one for the end
- For each child, add two instances, one for the start and one for the end
Sort the list, keeping in mind that the ordering is a bit tricky:
- Must be ordered by position
- When the position is the same, then:
  - The start of a parent must come before its own end
  - The start of a child must come before its own end
  - The start of a parent at some position X must come before the start of a child at the same position X
  - The end of a child at some position X must come before the end of a parent at the same position X

Evaluating the alternative algorithm

Given input with M parents and N children, there are M * N possible combinations of pairs. To contrast the performance of the original and the suggested algorithms, let's also consider a case where only a small subset of parents contain only a small subset of children, that is, let's say that on average X parents contain Y children.

The original code will perform M * N comparisons, most of them will not be part of the answer.

The suggested alternative will perform an initial sorting step of 2 * (M + N) items, which is a log-linear operation: O((M + N) log (M + N)). Then the main part of the algorithm makes a single linear pass over the sorted items, generating the X * Y pairs with constant overhead per pair: O(M + N + X * Y). The linked hashset makes this possible.

When X * Y is very close to M * N, the overhead of the alternative algorithm may outweigh the benefits it brings. However, the overhead grows log-linearly with M + N, which is significantly smaller than M * N.

In other words, for large values of M and N and a uniformly random distribution of X and Y, the alternative algorithm will perform significantly better on average.
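As a rough worked comparison, take the upper bounds from the constraints in the question, M = N = 10^5, and write K for the number of pairs in the answer (K is a symbol introduced here for illustration). These are order-of-magnitude estimates, not measurements:

```latex
\begin{align*}
T_{\text{nested loops}} &= \Theta(M \cdot N) \approx 10^{10} \text{ comparisons} \\
T_{\text{alternative}} &= O\big((M+N)\log(M+N) + K\big) \\
2(M+N)\,\log_2\big(2(M+N)\big) &\approx 4 \cdot 10^{5} \times 18.6 \approx 7.4 \cdot 10^{6}
\end{align*}
```

So as long as K stays well below M * N, the sorting step dominates, and it is about three orders of magnitude cheaper than checking all pairs.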

Ordering of the pairs in the answer

I want to point out that the question doesn't specify the ordering of pairs in the answers. If a specific ordering is required, it should be easy to modify the algorithm accordingly.

Alternative implementation

Here's an implementation of the ideas above, and assuming that the pairs in the answer can be in any order.

List<List<Integer>> findPositions(List<List<Integer>> parent, List<List<Integer>> child) {
  List<Tracker> items = new ArrayList<>();

  // add the intervals with their original indexes from parent, and the parent flag set to true
  for (int index = 0; index < parent.size(); index++) {
    List<Integer> item = parent.get(index);
    // a parent with start > end can never satisfy p1 <= c1 <= c2 <= p2, so skip it
    if (item.get(0) > item.get(1)) continue;
    items.add(new Tracker(item.get(0), true, index, true));
    items.add(new Tracker(item.get(1), false, index, true));
  }

  // add the intervals with their original indexes from child, and the parent flag set to false
  for (int index = 0; index < child.size(); index++) {
    List<Integer> item = child.get(index);
    // a child with start > end violates c1 <= c2, so skip it
    if (item.get(0) > item.get(1)) continue;
    items.add(new Tracker(item.get(0), true, index, false));
    items.add(new Tracker(item.get(1), false, index, false));
  }

  // sort the items by their position; at the same position, order by rank:
  // parent start (0) before child start (1),
  // child end (2) before parent end (3),
  // start before end of child/parent.
  // Ranking keeps the comparator consistent: two items of the same kind
  // at the same position compare as equal.
  items.sort(Comparator.<Tracker>comparingInt(tracker -> tracker.position)
    .thenComparingInt(tracker -> tracker.isStart
      ? (tracker.isParent ? 0 : 1)
      : (tracker.isParent ? 3 : 2)));

  // prepare the list where we will store the answers
  List<List<Integer>> answer = new ArrayList<>();

  // track the parents that are started, in their insertion order
  LinkedHashSet<Integer> startedParents = new LinkedHashSet<>();

  // process the items one by one from left to right
  for (Tracker item : items) {
    if (item.isParent) {
      if (item.isStart) startedParents.add(item.index);
      else startedParents.remove(item.index);
    } else {
      if (!item.isStart) {
        int childStart = child.get(item.index).get(0);
        for (int parentIndex : startedParents) {
          int parentStart = parent.get(parentIndex).get(0);
          if (parentStart <= childStart) {
            answer.add(Arrays.asList(parentIndex, item.index));
          } else {
            break;
          }
        }
      }
    }
  }
  return answer;
}

private static class Tracker {
  final int position;
  final boolean isStart;
  final int index;
  final boolean isParent;

  Tracker(int position, boolean isStart, int index, boolean isParent) {
    this.position = position;
    this.isStart = isStart;
    this.index = index;
    this.isParent = isParent;
  }
}

score 0 · Answer 3 · answered Dec 22 '22 at 15:29

First, if one matching child per parent is enough, you can add a break inside the if block:

if((p1 <= c1) && (c1 <= c2) && (c2 <= p2)) {
  answer.add(Arrays.asList(i, j));
  break;
}

Second, this code has a time complexity of O(n^2), so its running time grows quickly as the input gets larger. To reduce it, you can use other data structures such as trees, where searching takes O(log n) time.

RBinaryTree<Pair> tree = new RBinaryTree<>();

and in if condition...

 tree.add(new Pair(i, j));

Create a Pair class like

private static class Pair {
    int p;
    int c;

    Pair(int p, int c) {
        this.p = p;
        this.c = c;
    }
}

Also, you can use other approaches like divide and conquer, dividing the list into sublists.

_"Also, you can use some other approaches like divide and conquer by dividing to list into sublists"_. Instead of providing vague/random hints, please add a proper approach to solve the problem — Abhinav Mathur, Dec 25 '22 at 06:44

score 0 · Answer 4 · answered Dec 26 '22 at 05:58

It's my honor to share my thoughts. Maybe there are still some shortcomings that I haven't found, please correct them. This is for reference only.

First, process the parent list and child list, and add a third element to represent their input order. Then we need to write a Comparator

    Comparator<List<Integer>> listComparator = (o1, o2) -> {
        if (o1.get(0) < o2.get(0)) {
            return -1;
        } else if (o1.get(0) > o2.get(0)) {
            return 1;
        }
        if (o1.get(1) < o2.get(1)) {
            return -1;
        } else if (o1.get(1) > o2.get(1)) {
            return 1;
        }
        return 0;
    };

and use list.stream().sorted() to sort the elements in the list. At the same time, we can use list.stream().filter() to filter out the invalid elements (those whose start is greater than their end), so that we get an ordered list. With the ordered lists, we can walk through the parent list, find the elements in the child list that satisfy the condition, and record the index where the search started. When comparing the subsequent elements of the parent list, we can start the search directly from the recorded index.

Finally, the results are sorted in ascending order and returned.

Here is the complete code:

static List<List<Integer>> process(List<List<Integer>> parent, List<List<Integer>> child) {
    // Order by the first element, then by the second element
    Comparator<List<Integer>> listComparator = (o1, o2) -> {
        int byFirst = Integer.compare(o1.get(0), o2.get(0));
        return byFirst != 0 ? byFirst : Integer.compare(o1.get(1), o2.get(1));
    };
    // Copy every interval as (start, end, original index) so that the input lists are not
    // modified, drop the invalid intervals (start > end) and sort the rest
    List<List<Integer>> parentSorted = IntStream.range(0, parent.size())
            .mapToObj(i -> Arrays.asList(parent.get(i).get(0), parent.get(i).get(1), i))
            .filter(integers -> integers.get(0) <= integers.get(1))
            .sorted(listComparator)
            .collect(Collectors.toList());
    List<List<Integer>> childSorted = IntStream.range(0, child.size())
            .mapToObj(i -> Arrays.asList(child.get(i).get(0), child.get(i).get(1), i))
            .filter(integers -> integers.get(0) <= integers.get(1))
            .sorted(listComparator)
            .collect(Collectors.toList());
    int childPointer = 0;
    List<List<Integer>> answer = new ArrayList<>();
    for (List<Integer> p : parentSorted) {
        // Parents are sorted by start, so a child that starts before this parent also starts
        // before every later parent: move the child pointer past it for good
        while (childPointer < childSorted.size() && childSorted.get(childPointer).get(0) < p.get(0)) {
            childPointer++;
        }
        for (int j = childPointer; j < childSorted.size(); j++) {
            List<Integer> c = childSorted.get(j);
            if (c.get(0) > p.get(1)) {
                // This child and all later ones start after the parent ends
                break;
            }
            if (c.get(1) <= p.get(1)) {
                answer.add(Arrays.asList(p.get(2), c.get(2)));
            }
        }
    }
    // Sort the pairs by parent index, then by child index
    return answer.stream().sorted(listComparator).collect(Collectors.toList());
}

score 0 · Answer 5 · answered Dec 26 '22 at 10:02

The idea is similar to balanced parentheses: ()()) is invalid and (())() is valid. Now we use (P1, -P1), (C1, -C1), ... as the symbols instead of (, ), where Pi is the start time of parent i and -Pi is its end time, and similarly for all the other variables. We say Ci is balanced with Pi iff both Ci and -Ci are present between Pi and -Pi.

Some implementation details: first sort all the numbers, make a stack and push the symbols beginning with the earliest start time (the first event). An example stack might look like start: [P1, C3, P2, C2, C1, P3, -C2, -P1, -C3, -P3, -C1, -P2] :top. Now maintain a list for each parent that keeps track of the children between its symbols, and find the ones that start and end in the scope of the parent, i.e. both Ci and -Ci are in the list of Pi. The list closes when -Pi is read.

Hope this helps!

score 0 · Answer 6 · answered Dec 26 '22 at 11:20

Using the Streams API from Java 8 might process this more efficiently, but I'm not sure whether it would help in your context.

  static List<List<Integer>> process(List<List<Integer>> parent, List<List<Integer>> child) {
    List<List<Integer>> answer = new ArrayList<>();
    IntStream.range(0, parent.size()).forEach(parentIndex -> IntStream.range(0, child.size()).forEach(childIndex -> {
      List<Integer> p = parent.get(parentIndex);
      List<Integer> c = child.get(childIndex);
      int p1 = p.get(0);
      int p2 = p.get(1);
      int c1 = c.get(0);
      int c2 = c.get(1);
      if((p1 <= c1) && (c1 <= c2) && (c2 <= p2)) {
        answer.add(Arrays.asList(parentIndex, childIndex));
      }
    }));

    return answer;
  }

Following is another implementation using Streams API

  static List<List<Integer>> process(List<List<Integer>> parent, List<List<Integer>> child) {
    return IntStream.range(0, parent.size()).boxed()
        // flatMap keeps one [parentIndex, childIndex] pair per match
        .flatMap(parentIndex ->
            IntStream.range(0, child.size()).filter(childIndex -> {
              List<Integer> p = parent.get(parentIndex);
              List<Integer> c = child.get(childIndex);
              int p1 = p.get(0);
              int p2 = p.get(1);
              int c1 = c.get(0);
              int c2 = c.get(1);
              return ((p1 <= c1) && (c1 <= c2) && (c2 <= p2));
            }).mapToObj(childIndex -> Arrays.asList(parentIndex, childIndex)))
        .collect(Collectors.toList());
  }
