
I am trying to solve a variant of a problem I asked earlier:

Given a string of parentheses (length <= 1,000,000) and a list of <= 100,000 range queries, find the length of the longest balanced subsequence of parentheses within each queried range.

I found this other SO question that is similar but only has an O(N^3) algorithm.

I believe that a DP solution of the form dp[i, j] = longest balanced subsequence in [i .. j] should work, because once the table is computed, every range query can be answered with a single lookup. However, even an O(N^2) solution would exceed the time limits given how long the input string can be.
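For reference, the dp[i, j] idea can be sketched as a brute-force interval DP. This is only a checker for small inputs, not something that meets the stated limits: it runs in roughly cubic time, and the name `longest_balanced_subsequence` is chosen here for illustration.

```python
from functools import lru_cache


def longest_balanced_subsequence(s: str, lo: int, hi: int) -> int:
    """Length of the longest balanced subsequence of s[lo..hi] (inclusive)."""

    @lru_cache(maxsize=None)
    def best(i: int, j: int) -> int:
        # Fewer than two characters cannot hold a pair.
        if i >= j:
            return 0
        # Option 1: leave s[i] out of the subsequence.
        result = best(i + 1, j)
        # Option 2: pair s[i] == '(' with some later ')' at position k,
        # then solve the part inside the pair and the part after it.
        if s[i] == '(':
            for k in range(i + 1, j + 1):
                if s[k] == ')':
                    result = max(result, 2 + best(i + 1, k - 1) + best(k + 1, j))
        return result

    return best(lo, hi)


example = "))(()(())))(())"
print(longest_balanced_subsequence(example, 0, 8))  # 6
```

Because it tries every possible pairing, it is useful mainly for cross-checking faster approaches on short strings.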

Further, the trick of using a stack to keep track of matching parentheses no longer directly works because you are looking for subsequences, not substrings.

I have a method which I think might work but am not sure of:

The length of the longest subsequence of balanced parentheses within an interval is the sum of the lengths of the longest non-overlapping substrings of balanced parentheses within that interval.

For example, if you have the string

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14

) ) ( ( ) ( ( ) ) ) ) ( ( ) )

The length of the longest subsequence of balanced parentheses in the interval [0, 8] (inclusive) is 6. This is equal to the sum of the lengths of the longest non-overlapping balanced substrings within the interval: "( )" + "( ( ) )".

Will this method always hold, or is there a better way?

1110101001
  • does this problem exist in some online judge system? – Herokiller Oct 30 '14 at 02:36
  • *Will this method always hold, or is there a better way?* Yes, and yes. There are many possible maximal subsequences, and pretty much any reasonable way of generating one will work. In fact there is an easy `O(n)` answer. But I won't ruin whatever competition you got it from by giving it to you. – btilly Oct 30 '14 at 02:49
  • @Herokiller This problem is a variant I thought of based on this past contest problem: http://www.usaco.org/index.php?page=viewproblem2&cpid=194 – 1110101001 Oct 30 '14 at 04:17
  • @user2612743 I think we can make O(n) dynamics here, but cannot check without actual tests – Herokiller Oct 30 '14 at 04:24
  • @btilly Does the O(n) solution involve using a stack and then when encountering a matching pair, popping off the stack? – 1110101001 Oct 30 '14 at 04:25
  • @btilly Also, if it is O(n) for each query with a range length of n, then the worst case is that each of the `100000` queries is for the full range of length `1000000`, which is not optimized enough. There should be a way for each query to be log(N) or better... – 1110101001 Oct 30 '14 at 04:31
  • http://codeforces.com/contest/5/problem/C – Rezwan Arefin Feb 26 '17 at 21:41

2 Answers

6

Since someone else posted an answer, here is an O(n) answer to the single query with O(1) space. Keep a count of balanced pairs and pointers to the last open and last closed paren. Until you've run off the string, scan forward from the last open paren to find another open paren. Then scan forward from the max of the last open and last closed paren to find the next closed paren. If you find a pair that way, increment the count of balanced pairs. When you reach the end of the string, you will have the correct count, even though you may have paired up the parens incorrectly.

There may actually be multiple maximal subsequences of balanced parens. But if you take any maximal subsequence of balanced parens, and replace every open paren with the left-most possible open paren, and then every close paren with the left-most possible close paren, the result will be the ones you found. (Proof left as an instructive exercise to the reader.)

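Here is Python-style pseudo-code.

def count_pairs(s):
    parens = 0        # matched pairs found so far
    last_open = 0     # scans for the next unused open paren
    last_closed = 0   # scans for the next unused close paren
    while last_open < len(s) and last_closed < len(s):
        if s[last_open] == ')':
            # We are looking for the next open paren.
            last_open += 1
        elif last_closed < last_open:
            # Start our search for a close paren after the current char.
            last_closed = last_open + 1
        elif s[last_closed] == '(':
            # Still looking for a close paren.
            last_closed += 1
        else:
            # We found a matching pair. Advance both pointers so that
            # neither paren is used twice.
            parens += 1
            last_open += 1
            last_closed += 1
    # parens is the number of pairs; the subsequence length is 2 * parens.
    return parens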
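The same count can also be reached with a single counter of unmatched open parens instead of two pointers. The sketch below is that swapped-in variant rather than the two-pointer scan itself; the name `count_balanced_pairs` is made up here, and it returns the number of pairs, so the subsequence length is twice the result.

```python
def count_balanced_pairs(s: str) -> int:
    """Number of '(' ... ')' pairs in a longest balanced subsequence of s."""
    open_unmatched = 0  # open parens seen so far that have no partner yet
    pairs = 0
    for ch in s:
        if ch == '(':
            open_unmatched += 1
        elif ch == ')' and open_unmatched > 0:
            # Pair this close paren with any earlier unmatched open paren.
            open_unmatched -= 1
            pairs += 1
    return pairs


print(2 * count_balanced_pairs("))(()(())"))  # 6
```

A close paren with no unmatched open paren before it can never be part of a balanced subsequence, which is why it is simply skipped.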

And next we have the challenge of many range queries. It turns out that making this fast takes O(n) precomputation and O(n) space, and each range query will be O(log(n)) time.

Here is the hint for that problem. Suppose that we have 2 blocks A and B right next to each other. Each of them internally has some number of balanced pairs of parens, some number of additional open parens available to match on the right, and some number of additional close parens available to match on the left. Then the combined block C has the following:

C.balanced = A.balanced + B.balanced + min(A.open, B.close)
C.open = B.open + max(A.open - B.close, 0)
C.close = A.close + max(B.close - A.open, 0)

I leave to you the exercise of figuring out what set of blocks to precompute to be able to compute any block in time O(log(n)).
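One way to take up that exercise is a segment tree over the string. That choice is an assumption here, since the answer deliberately leaves it open. In this sketch `Node`, `combine`, and `ParenTree` are illustrative names; `combine` mirrors the three formulas above, the build is O(n), and each query touches O(log(n)) precomputed blocks.

```python
from typing import List, Tuple

# (balanced pairs, unmatched open parens, unmatched close parens)
Node = Tuple[int, int, int]
EMPTY: Node = (0, 0, 0)


def combine(a: Node, b: Node) -> Node:
    """Merge block a with the block b immediately to its right."""
    balanced = a[0] + b[0] + min(a[1], b[2])
    open_ = b[1] + max(a[1] - b[2], 0)
    close = a[2] + max(b[2] - a[1], 0)
    return (balanced, open_, close)


class ParenTree:
    def __init__(self, s: str) -> None:
        self.n = len(s)
        self.tree: List[Node] = [EMPTY] * (2 * self.n)
        # Leaves: a single paren is one unmatched open or one unmatched close.
        for i, ch in enumerate(s):
            self.tree[self.n + i] = (0, 1, 0) if ch == '(' else (0, 0, 1)
        # Internal nodes, built bottom-up.
        for i in range(self.n - 1, 0, -1):
            self.tree[i] = combine(self.tree[2 * i], self.tree[2 * i + 1])

    def query(self, lo: int, hi: int) -> int:
        """Length of the longest balanced subsequence in s[lo..hi], inclusive."""
        left, right = EMPTY, EMPTY
        l, r = lo + self.n, hi + self.n + 1
        while l < r:
            if l & 1:
                # Blocks taken from the left end are appended left to right.
                left = combine(left, self.tree[l])
                l += 1
            if r & 1:
                # Blocks taken from the right end are prepended.
                r -= 1
                right = combine(self.tree[r], right)
            l >>= 1
            r >>= 1
        return 2 * combine(left, right)[0]


tree = ParenTree("))(()(())))(())")
print(tree.query(0, 8))  # 6
```

Because `combine` is not commutative, the query keeps separate left and right accumulators so that blocks are always merged in string order.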

btilly
  • If you want to repeat this efficiently for multiple range queries, could you somehow incorporate the method of prefix sums into this? Or else create a hashmap that maps last_open and last_closed to a value of parens, allowing for range queries? – 1110101001 Oct 30 '14 at 06:20
  • @user2612743 I added a hint for how to adjust this method to many efficient range queries. – btilly Oct 30 '14 at 07:06
  • You should precompute in blocks of size two right? That way given any range you can keep dividing in half until you get a block of size two, and combine the various blocks of size two to get a block of size 4, 8, and so on until you get the full range. – 1110101001 Oct 31 '14 at 00:00
  • If you run the algorithm on the string "(()" you get an answer of 2 for the number of sets of matching parentheses, when it should only give 1 – 1110101001 Oct 31 '14 at 05:53
  • @user2612743 How so? Whether I view it as `((`+`)` or `(`+`()`, `(()` gives me open 1, balanced 1, close 0. Which is as it should be. – btilly Nov 02 '14 at 05:20
  • I came up with the same idea for single query and it works for quite a few examples. But I don't have a proof. Any intuition on why this finds the answer? – sinoTrinity May 07 '15 at 02:53
  • @sinoTrinity The intuition is that this finds the maximal set of matched parens. And a set of matched parens is always a set of balanced parens, with the only difference being which parens are matched to each other. – btilly May 07 '15 at 09:54
2

I will describe an O(n) solution.

First, we have a dp[n] array. For each position i, if the character at i is a close bracket ), dp[i] stores the farthest (leftmost) start position of a valid sequence of parentheses that ends at i.

We maintain a stack which keeps track of the positions of open brackets. If we encounter an open bracket, we push its position onto the stack, and if we encounter a close bracket, we pop the last open bracket and update the dp array.

  • dp[i] = min(position of open bracket, dp[position of open bracket - 1]). This checks whether a valid sequence ends at the close bracket directly before the open bracket; if so, we extend dp[i] to that sequence's start.

So the answer is the largest value of i - dp[i] + 1 over all positions i where dp[i] is set.

Java code:

    import java.util.Arrays;
    import java.util.Stack;

    // The class name is arbitrary; it only makes the snippet compilable.
    public class LongestBalanced {
        public static void main(String[] args) {
            String val = "))(()(())))(())"; // the parentheses sequence
            // dp[i] = start of the longest valid sequence ending at i, or -1
            int[] dp = new int[val.length()];
            Arrays.fill(dp, -1);

            Stack<Integer> stack = new Stack<>(); // positions of unmatched '('
            for (int i = 0; i < val.length(); i++) {
                char c = val.charAt(i);
                if (c == '(') {
                    stack.push(i);
                } else if (!stack.isEmpty()) {
                    int v = stack.pop();
                    dp[i] = v;
                    // Extend to the left if a valid sequence ends right before v.
                    if (v > 0 && val.charAt(v - 1) == ')' && dp[v - 1] != -1) {
                        dp[i] = dp[v - 1];
                    }
                }
            }
            int result = 0;
            for (int i = 0; i < val.length(); i++) {
                if (dp[i] != -1) {
                    System.out.println(val.substring(dp[i], i + 1));
                    result = Math.max(result, i - dp[i] + 1);
                }
            }
            System.out.println(result);
        }
    }

Output:

()
()
()(())
(()(()))
()
(())
8
Pham Trung
  • This is O(n) for each range query though. If you have N range queries each of interval length N, this becomes O(N^2) which is too large. Is there any way to get O(lg N) or below for each query to get a total runtime of O(N lgN) or better when you have multiple queries? – 1110101001 Oct 30 '14 at 06:10
  • @user2612743 hmm, I think in your link, there is one person that answered your question before, so from the result of my algorithm, you need to build a tree structure of valid parentheses, and query on that tree, which can improve the performance! – Pham Trung Oct 30 '14 at 06:28
  • Doesn't the above algorithm only provide the longest *substring* of balanced parentheses, not *subsequence*? – 1110101001 Oct 30 '14 at 06:37
  • @user2612743 hmm, you are correct, this is not subsequence, will need to modify my answer – Pham Trung Oct 30 '14 at 06:57
  • Stack is probably inapplicable when dealing with subsequence, not substring. – sinoTrinity May 07 '15 at 02:49