Algorithms question: Largest contiguous subarray selection

Question

Q. Given two arrays, A and B, of equal length, find the largest possible contiguous subarray of indices [i,j] such that max(A[i: j]) < min(B[i: j]).

Example: A = [10, 21, 5, 1, 3], B = [3, 1, 4, 23, 56]

Explanation: A[4] = 1, A[5] = 3, B[4] = 23, B[5] = 56, max(A[4], A[5]) < min(B[4], B[5])

The indices are [4,5] (inclusive), and the largest contiguous subarray has length 2

I can do this in O(n2) brute force method but cannot seem to reduce the time complexity. Any ideas?

B[i,j] contains all the elements from index i to index j from the array B — David jones, Sep 14 '21 at 15:33
It is not formalized correctly. The constraint you've mentioned is `max(A[i, j]) < B[i, j]`, but `B[i, j]` is a subarray of ints, so what exactly did you mean? It is that `max(A[i, j])` should be smaller than **any** of the elements in `B[i, j]`? — ronpi, Sep 14 '21 at 15:33
So how _a single value_ `max(A[i,j])` chosen from `A` gets compared to a _set of values_ taken from `B`? — CiaPan, Sep 14 '21 at 15:36
It's in the description, guys: *"max(A[4], A[5]) < min(B[4], B[5])"* — user2390182, Sep 14 '21 at 15:38
A[i,j] = A[i : j] (like in python syntax but j th element is inclusive). It contains all the elements from the i th element to the jth element — David jones, Sep 14 '21 at 15:38
Where is this from? Can solutions be tested online somewhere? — no comment, Sep 14 '21 at 15:45
What did you try so far? I think we can achieve O(n) by using approach similar to KMP algorithm, which can result O(n) complexity. — Thinhbk, Sep 14 '21 at 15:46
The question is from an archived material of an Algorithms course. Since it is a theoretical course, there are no test cases to be tested for correctness — David jones, Sep 14 '21 at 15:47
@Thinhbk can you explain your approach? I can't see how you are relating to KMP. Also the length of the arrays are same as mentioned in the question — David jones, Sep 14 '21 at 15:48
I think the key approach is to have 2 for loop for matching/ validating, but the outer loop does not have to jump back once finding a violation to the condition from the inner loop, which resulting O(n) complexity instead of O(n^2) complexity. — Thinhbk, Sep 14 '21 at 16:04
Want to keep it free from any language and codes, as it is a theoretical question — David jones, Sep 14 '21 at 16:14
One idea that comes to mind is using mono queues prefix and suffix for max of A and min of B. I think there may be a linear solution from there but I'm not sure of it. — user1984, Sep 14 '21 at 16:18
Does `divide-and-conquer` tag refers to corresponding chapter of algorithms course? — MBo, Sep 14 '21 at 16:45

no comment · Accepted Answer · 2021-09-16T05:01:37.887

O(n) solution:

Move index j from left to right and drag i behind so that the window from i to j is valid. So always increase j by 1, and then increase i as much as needed for the window to be valid.

To do that, keep a queue Aq of indexes of non-increasing A-values. Then A[Aq[0]] tells you the max A-value in the window. Similarly, keep a queue for non-decreasing B-values.

For each new j, first update Aq and Bq according to the new A-value and new B-value. Then, while the window is invalid, increase i and drop Aq[0] and Bq[0] if they're i. When both j and i are updated, update the result with the window size j - i - 1.

Python implementation:

def solution(A, B):
    Aq = deque()
    Bq = deque()
    i = 0
    maxlen = 0
    for j in range(len(A)):
        while Aq and A[Aq[-1]] < A[j]:
            Aq.pop()
        Aq.append(j)
        while Bq and B[Bq[-1]] > B[j]:
            Bq.pop()
        Bq.append(j)
        while Aq and A[Aq[0]] >= B[Bq[0]]:
            if Aq[0] == i:
                Aq.popleft()
            if Bq[0] == i:
                Bq.popleft()
            i += 1
        maxlen = max(maxlen, j - i + 1)
    return maxlen

Test results from comparing against a naive brute force reference solution:

expect:  83   result:  83   same: True
expect: 147   result: 147   same: True
expect: 105   result: 105   same: True
expect:  71   result:  71   same: True
expect: 110   result: 110   same: True
expect:  56   result:  56   same: True
expect: 140   result: 140   same: True
expect: 109   result: 109   same: True
expect:  86   result:  86   same: True
expect: 166   result: 166   same: True

Testing code (Try it online!)

from timeit import timeit
from random import choices
from collections import deque
from itertools import combinations

def solution(A, B):
    Aq = deque()
    Bq = deque()
    i = 0
    maxlen = 0
    for j in range(len(A)):
        while Aq and A[Aq[-1]] < A[j]:
            Aq.pop()
        Aq.append(j)
        while Bq and B[Bq[-1]] > B[j]:
            Bq.pop()
        Bq.append(j)
        while Aq and A[Aq[0]] >= B[Bq[0]]:
            if Aq[0] == i:
                Aq.popleft()
            if Bq[0] == i:
                Bq.popleft()
            i += 1
        maxlen = max(maxlen, j - i + 1)
    return maxlen

def naive(A, B):
    return max((j - i + 1
                for i, j in combinations(range(len(A)), 2)
                if max(A[i:j+1]) < min(B[i:j+1])),
               default=0)

for _ in range(10):
    n = 500
    A = choices(range(42), k=n)
    B = choices(range(1234), k=n)
    expect = naive(A, B)
    result = solution(A, B)
    print(f'expect: {expect:3}   result: {result:3}   same: {result == expect}')

Beat me a minute with same idea ;) [longer code](https://ideone.com/M1brIL) — MBo, Sep 14 '21 at 18:17
@MBo Great, now I *really* don't know what to do next. More properly test my code, read the answer that claims O(n) without extra data structures, or read yours :-) — no comment, Sep 14 '21 at 18:30
No need to read mine - similar idea, worse implementation ;) — MBo, Sep 14 '21 at 18:35
An awesome solution indeed. Thanks for helping me with the solution. Order of N solutions are really something! — David jones, Sep 14 '21 at 19:18

Thinhbk · Answer 2 · 2021-09-16T14:05:48.023

1

I can see based on the problem, saying we have 2 conditions:

max(A[i,j-1]) < min(B[i,j-1])
max(A[i,j]) >= min(B[i,j])
saying maxA is index of max item in A array in [i,j], minB is index of min item in B array in [i,j]; and anchor is min(maxA, minB)

Then we will have: max(A[i+k,anchor]) >= min(B[i+k,anchor]) ∀ k in [i+1,anchor].

So I come up with a simple algorithm like below:

    int extractLongestRange(int n, int[] A, int[] B) {
        // n is size of both array
        int size = 0;
        for(int i = 0; i < n; i++){
            int maxAIndex = i;
            int minBIndex = i;
            for(int j = i; j < n; j++){
                if(A[maxAIndex] < A[j]){
                    maxAIndex = j;
                }
                if(B[minBIndex] > B[j]){
                    minBIndex = j;
                }
                if(A[maxAIndex] >= B[minBIndex]){
                    if(size < j - i){
                        size = j - i;
                    }
                    // here, need to jump back to min of maxAIndex and minBIndex.
                    i = Math.min(maxAIndex, minBIndex);
                    break;
                }
                // this case, if j reach the end of array
                if(j == n - 1){
                    if(size < j - i + 1){
                        size = j - i + 1;
                        i = j;
                    }
                }
            }
        }
        return size; 
}

With this approach, the complexity is O(n). We can change the output to pick-up the other information if needed.

edited Sep 16 '21 at 14:05

answered Sep 14 '21 at 17:36

Thinhbk

2,194
1
23
34

This solution is also awesome and is the answer to the problem. Thanks for spending some time on this problem. It helped me greatly. – David jones Sep 14 '21 at 19:19
But, it is giving wrong answer for arr1[] = {1, 2, 25, 1, 100}, arr2[] = {3, 10, 4, 23, 56};. Answer = 2, answer this function gives: 3 – David jones Sep 14 '21 at 19:28
Good catch, have to move break to outer if condition, once there're chances that the 2 last condition matches. Updated. – Thinhbk Sep 14 '21 at 19:48
@Thinhbk Could you add a link to a full program for easier testing? – no comment Sep 14 '21 at 20:01
@Davidjones It's wrong, though. Already the "observation" is wrong. – no comment Sep 14 '21 at 20:06
@don'ttalkjustcode: I am using java for my implementation, don't know how to add a link for it to be honest. I can convert my algorithm to javascript and put it to jsfiddle if you are interested. – Thinhbk Sep 15 '21 at 01:08
1

@Thinhbk For example [like this](https://ideone.com/jXkh5B). Which contains an example case that hopefully makes clear why your approach doesn't work: `A = {1, 1, 1, 1, 2, 2, 2, 2, 2}` `B = {2, 3, 3, 3, 3, 3, 3, 3, 3}`. You compute 5 instead of 8. – no comment Sep 15 '21 at 06:11
Hmm,I know the reason why my algo does not work with this input, I think i (the outer index) should jump back the the min of maxA index and minB index to be safe, that should work. – Thinhbk Sep 16 '21 at 13:52
For `A = {2}` `B = {1}` your new code crashes. – no comment Sep 16 '21 at 14:29

score 0 · Answer 3 · answered Sep 14 '21 at 16:53

This can be solved in O(n log(n)).

Here is an outline of my technique.

My idea looks like a "rising water level" of maximum in A, keeping track of the "islands" emerging in A, and the "islands" submerging in B. And recording the maximum intersections after they appear from A emerging, or before they disappear from B sinking.

We will need 2 balanced binary trees of intervals in A and B, and a priority queue of events.

The tree of intervals need to support a logarithmic "find and return interval containing i if it exists". It also needs to support a logarithmic "add i to tree, extending/merging intervals as appropriate, and return new interval". And likewise a logarithmic "remove i from tree, shortening, splitting, or deleting its interval as appropriate".

Events can be of the form "remove B[i]" or "add A[i]". The priority queue is sorted first by the size of the element added/subtracted, and then by putting B before A. (So we do not elements of size x to A until all of the elements of size x are removed from B.)

With that in mind, here is pseudocode for how to do it.

Put all possible events in the priority queue
Initialize an empty tree of intervals A_tree
Initialize a tree of intervals B_tree containing the entire range
Initialize max interval to (0, -1) (length 0)

For each event (type, i) in queue:
    if event.type = A:
        new_interval = A_tree.add(event.i)
        
        if event.i in B_tree:
            Intersect event.i's A_tree interval with event.i's B_tree interval
            if intersection is longer than max_interval:
                update to new and better max_interval

    else:
        if event.i in A_tree:
            Intersect event.i's A_tree interval with event.i's B_tree interval
            if intersection is longer than max_interval:
                update to new and better max_interval

        B_tree.remove(event.i)

Processing any event is O(log(n). There are 2n = O(n) events. So overall time is O(n log(n)).

Algorithms question: Largest contiguous subarray selection

3 Answers3