
Given a binary array (each element is 0 or 1), I need to find the maximum length of a sub array consisting entirely of ones within a given range [l, r] of the array.

I know the O(n) approach to find such a sub array for a single query, but if there are O(n) queries, the overall complexity becomes O(n^2).
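For reference, the O(n)-per-query scan mentioned above might look like the following minimal Python sketch. The function name `longest_ones_naive` is illustrative only, and the range is assumed to be inclusive at both ends.

```python
def longest_ones_naive(a, l, r):
    """Length of the longest run of 1s in a[l..r], both ends inclusive."""
    best = current = 0
    for i in range(l, r + 1):
        # Extend the current run on a 1, reset it on a 0.
        current = current + 1 if a[i] == 1 else 0
        best = max(best, current)
    return best
```

Each call touches up to r - l + 1 elements, which is where the O(n^2) total for O(n) queries comes from.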

I know that a segment tree is used for this type of problem, but I can't figure out how to build the tree for this problem.

Can I build a segment tree with which I can answer each query in O(log(n)) time, so that for O(n) queries the overall complexity becomes O(n log(n))?

efex09
  • Your goal is unclear. What are these `O(n)` queries? Do you actually want to build a list of sub-arrays in order of decreasing length? – Nelfeal Nov 09 '18 at 10:41
  • Suppose the array has 100 elements; then there can be at most 100 queries. – efex09 Nov 09 '18 at 10:45
  • You say "I know the O(n) approach to find such sub array"; what is this approach and how does the overall complexity suddenly become `O(n^2)`? – Nelfeal Nov 09 '18 at 10:46
  • @Nelfeal, OP wants to say that for each query range [L,R], he can give the answer in O(R-L) which could become O(N) in worst case and there are N such queries to answer. Hence, overall it becomes O(N^2). – nice_dev Nov 09 '18 at 10:58
  • @vivek_23 Oh, I read that as "find `L` and `R` such that [L;R] is the sub-array of maximum length". – Nelfeal Nov 09 '18 at 11:00
  • @Nelfeal never mind. – nice_dev Nov 09 '18 at 11:13

2 Answers


Let A be your binary array.
Build two arrays, IL and IR:
- IL contains, in order, every i such that A[i] = 1 and (i = 0 or A[i-1] = 0);
- IR contains, in order, every i such that A[i-1] = 1 and (i = N or A[i] = 0).

In other words, for any i, the range defined by IL[i] inclusive and IR[i] non-inclusive corresponds to a sequence of 1s in A.
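A minimal Python sketch of building the two arrays, following the definitions above. The function name `build_runs` and the lowercase names `il` and `ir` are illustrative choices, not part of the original description.

```python
def build_runs(a):
    """Return (il, ir): run starts (inclusive) and run ends (exclusive) of the 1-runs in a."""
    n = len(a)
    # A run starts at i if a[i] is 1 and the previous element is absent or 0.
    il = [i for i in range(n) if a[i] == 1 and (i == 0 or a[i - 1] == 0)]
    # A run ends (exclusively) at i if a[i-1] is 1 and a[i] is absent or 0.
    ir = [i for i in range(1, n + 1) if a[i - 1] == 1 and (i == n or a[i] == 0)]
    return il, ir
```

For the array `[0, 1, 1, 0, 1, 1, 1, 0]` this gives `il = [1, 4]` and `ir = [3, 7]`, i.e. the runs `[1, 3)` and `[4, 7)`.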

Now, for any query {L, R} (for the range [L; R] inclusive), let S = 0. Traverse both IL and IR with i until IL[i] >= L (or the arrays are exhausted). At this point, if i > 0 and IR[i-1] > L, the previous run covers L, so set S = min(IR[i-1], R+1) - L. Continue traversing IL and IR, setting S = max(S, IR[i]-IL[i]), until IR[i] > R. Finally, if such an i exists and IL[i] <= R, that run is cut off at R, so set S = max(S, R+1-IL[i]).

S is now the size of the greatest sequence of 1s in A between L and R.
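The query step could be sketched in Python as below. This is a simplified variant of the traversal, not a literal transcription: instead of treating the first and last runs as special cases, it clamps every overlapping run to the query range. The function name `longest_ones_runs` is hypothetical, R is assumed inclusive, and `il`/`ir` are assumed to hold run starts and exclusive run ends as defined above.

```python
def longest_ones_runs(il, ir, l, r):
    """Longest run of 1s inside [l, r] (inclusive), given run starts il and exclusive ends ir."""
    best = 0
    for start, end in zip(il, ir):
        if end <= l:
            continue  # run lies entirely before the query range
        if start > r:
            break     # this run and all later ones lie after the range
        # Clamp the run [start, end) to the half-open query range [l, r + 1).
        best = max(best, min(end, r + 1) - max(start, l))
    return best
```

The loop is still O(M) per query; since `il` is sorted, the first overlapping run could be located with a binary search, but the runs inside the range still have to be scanned.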

The complexity of building IL and IR is O(N), and the complexity of answering a query is O(M), with M the length of IL or IR.

Nelfeal
  • Isn't `M` `N/2` in worst case? The worst case complexity for all queries would still be O(N^2). – merlyn Nov 09 '18 at 11:47
  • `M = N/4` in the worst case, but I'm guessing something like `O(log N)` on average. – Nelfeal Nov 09 '18 at 11:49
  • It really depends on what OP wants exactly, because this kind of algorithm is going to be way faster than any tree-based solution for even large values of `N` (but I guess not asymptotically). – Nelfeal Nov 09 '18 at 11:55

Yes, you can use a segment tree to solve this problem.

Let's think about what that tree must look like. Obviously, every node must contain the length of the longest subarray of 1s in its range.

Now, how do we join two nodes into a bigger one? In other words, you have a node representing [low, mid) and a node representing [mid, high), and you have to obtain the max subarray for [low, high). First things first: the max for the whole range is at least the max of either part, so we take the maximum of the left and right values.

But what if the real max subarray overlaps both nodes? Well, then it must be the rightmost part of left node and leftmost part of right node. So, we need to keep track of longest subarray at start and end as well.

Now, how to update these left and rightmost subarray lengths? Well, leftmost of parent node must be leftmost of left child, unless leftmost of left child spans the entire left node. In that case, leftmost of parent node will be leftmost of left + leftmost of right node.

A similar rule applies to tracking the rightmost subarray of 1s.

And we're finished. Here are the final rules in pseudocode.

max_sub[parent] = max(max_sub[left], max_sub[right], right_sub[left] + left_sub[right])
left_sub[parent] = left_sub[left] if left_sub[left] < length[left] else left_sub[left] + left_sub[right]
right_sub[parent] = right_sub[right] if right_sub[right] < length[right] else right_sub[right] + right_sub[left]

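Note that you will need to take similar steps when finding the result for a range: the nodes that cover the query range have to be merged, in left-to-right order, using the same rules.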
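To make the rules concrete, here is a minimal Python sketch of such a tree. The names `combine`, `EMPTY` and `RunTree` are illustrative, each node is stored as a tuple `(length, max_sub, left_sub, right_sub)`, and queries are assumed to take inclusive bounds. It is one possible layout under those assumptions, not the only way to implement the idea.

```python
def combine(left, right):
    """Merge summaries of two adjacent ranges; each is (length, max_sub, left_sub, right_sub)."""
    l_len, l_max, l_pre, l_suf = left
    r_len, r_max, r_pre, r_suf = right
    return (
        l_len + r_len,
        max(l_max, r_max, l_suf + r_pre),           # best run may straddle the boundary
        l_pre if l_pre < l_len else l_pre + r_pre,  # prefix run spills into the right part
        r_suf if r_suf < r_len else r_suf + l_suf,  # suffix run spills into the left part
    )

EMPTY = (0, 0, 0, 0)  # summary of an empty range; acts as an identity for combine

class RunTree:
    def __init__(self, a):
        self.n = len(a)
        self.tree = [EMPTY] * (4 * max(self.n, 1))
        if self.n:
            self._build(a, 1, 0, self.n)

    def _build(self, a, node, low, high):
        # node covers the half-open range [low, high)
        if high - low == 1:
            bit = a[low]
            self.tree[node] = (1, bit, bit, bit)
            return
        mid = (low + high) // 2
        self._build(a, 2 * node, low, mid)
        self._build(a, 2 * node + 1, mid, high)
        self.tree[node] = combine(self.tree[2 * node], self.tree[2 * node + 1])

    def _query(self, node, low, high, l, r):
        # Summary of the overlap between [low, high) and [l, r).
        if r <= low or high <= l:
            return EMPTY
        if l <= low and high <= r:
            return self.tree[node]
        mid = (low + high) // 2
        return combine(self._query(2 * node, low, mid, l, r),
                       self._query(2 * node + 1, mid, high, l, r))

    def longest_ones(self, l, r):
        """Longest run of 1s in a[l..r], both ends inclusive."""
        return self._query(1, 0, self.n, l, r + 1)[1]
```

Building takes O(n) and each query visits O(log n) nodes. For example, `RunTree([0, 1, 1, 0, 1, 1, 1, 0]).longest_ones(2, 5)` merges the partial results for indices 2 to 5 and returns 2.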

Here's an example tree for the array [0, 1, 1, 0, 1, 1, 1, 0].

[Image: an example tree]

merlyn
  • Could you please add a diagram to illustrate? – nice_dev Nov 09 '18 at 12:08
  • @vivek_23 I have added a diagram. – merlyn Nov 09 '18 at 12:45
  • We can simplify the computation. In `[0, 1, 1, 0, 1, 1, 1, 0]`, we can maintain a left sum array and a right sum array. The left sum array (left to right) is `[0, 1, 2, 0, 1, 2, 3, 0]` and the right sum array (right to left) is `[0, 2, 1, 0, 3, 2, 1, 0]`. The tricky part is the overlapping case. We can use these arrays to get the run of `1`s from the left to add to the run of `1`s from the right. To get that from the left sum for a range `[L,R]`, the run of `1`s from the left will be `left_sum[R]-left_sum[L-1]`. The same goes for the right sum. Hence, we don't need to maintain `left` and `right` for each node, just the `max` itself. – nice_dev Nov 10 '18 at 19:38
  • So, now you could take the max for a node as `max(left,right,left_sum[R]-left_sum[L-1] + right_sum[R] - right_sum[L-1])`. Note that `R` and `L` need to be adjusted for `right_sum` accordingly, since it runs from right to left. – nice_dev Nov 10 '18 at 19:45
  • @vivek_23 I think there is a mistake in your code. Instead of L & R, you should take the left run from mid and right run from mid + 1. But the idea is pretty good. Maybe you should add another answer? – merlyn Nov 11 '18 at 02:08
  • Yeah, I meant `L` and `R` for lower nodes. Sorry, I should have mentioned it, and yes, it will be `mid` for the next higher level. – nice_dev Nov 11 '18 at 13:48