
I am trying to solve this question on LeetCode:

A string s is nice if, for every letter of the alphabet that s contains, it appears both in uppercase and lowercase. For example, "abABB" is nice because 'A' and 'a' appear, and 'B' and 'b' appear. However, "abA" is not because 'b' appears, but 'B' does not.
Given a string s, return the longest substring of s that is nice. If there are multiple, return the substring of the earliest occurrence. If there are none, return an empty string.
For s = "YazaAay", the expected output is: "aAa"
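For reference, the definition can be checked directly. The following is only a brute-force baseline sketch, not the top-voted solution; the class name `NiceBruteForce` and the helpers `isNice` and `longestNice` are hypothetical names chosen for illustration, and it assumes the input contains only English letters.

```java
import java.util.HashSet;
import java.util.Set;

final class NiceBruteForce {
    // A string is nice if every letter it contains appears in both cases.
    static boolean isNice(String s) {
        Set<Character> seen = new HashSet<>();
        for (char c : s.toCharArray()) seen.add(c);
        for (char c : seen) {
            if (!seen.contains(Character.toUpperCase(c))
                    || !seen.contains(Character.toLowerCase(c))) {
                return false;
            }
        }
        return true;
    }

    // Try every substring, longest first and leftmost first, so the first
    // nice candidate found is the longest one with the earliest occurrence.
    static String longestNice(String s) {
        for (int len = s.length(); len >= 2; len--) {
            for (int start = 0; start + len <= s.length(); start++) {
                String candidate = s.substring(start, start + len);
                if (isNice(candidate)) return candidate;
            }
        }
        return "";
    }
}
```

With the stated limit of `1 <= s.length <= 100`, this is slow but workable, and it is handy for cross-checking a faster solution.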

One of the top voted solutions uses a Divide and Conquer approach:

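import java.util.HashSet;
import java.util.Set;

class Solution {
    public String longestNiceSubstring(String s) {
        // A nice string needs at least one letter in both cases.
        if (s.length() < 2) return "";
        char[] arr = s.toCharArray();
        // Collect every character that occurs in s.
        Set<Character> set = new HashSet<>();
        for (char c: arr) set.add(c);
        for (int i = 0; i < arr.length; i++) {
            char c = arr[i];
            // c occurs in both cases, so it does not force a split.
            if (set.contains(Character.toUpperCase(c)) && set.contains(Character.toLowerCase(c))) continue;
            // c occurs in only one case: no nice substring can contain it,
            // so solve the parts before and after it separately.
            String sub1 = longestNiceSubstring(s.substring(0, i));
            String sub2 = longestNiceSubstring(s.substring(i+1));
            // >= keeps the earlier substring on ties.
            return sub1.length() >= sub2.length() ? sub1 : sub2;
        }
        // Every character is paired, so s itself is nice.
        return s;
    }
}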

I understand how it works, but not the intuition behind using a Divide and Conquer approach. In other words, if I revisit the problem in a few days or weeks, after I have forgotten everything about it, I won't be able to recognize it as a Divide and Conquer problem.

What is that 'thing' that makes it solvable by a Divide and Conquer approach?

P.K.
  • In general divide and conquer is a great fit for complex scenarios, because you can easily run the code on multiple threads. (btw.: the recursive method invocation in this code example looks evil, I think this could lead to StackOverflowError, if the input is too big) – Benjamin M Mar 03 '21 at 17:22
  • @BenjaminM, yes, in this case the problem constraints say that the limits are not too high: `1 <= s.length <= 100`. – P.K. Mar 03 '21 at 17:25

2 Answers

This is how the algorithm could be described in plain English:

If the entire string is nice, we are done.

Otherwise, there must be a character which exists in only one case. Such a character naturally divides the string into two substrings. Conquer each of them individually, and compare results.
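The split is safe because no nice substring can contain a character whose other case is missing from the whole string, so every candidate lies entirely on one side of it. The sketch below makes that explicit by cutting at every such character in one pass, rather than only at the first one. It is a variant for illustration, not the code from the question, and the names `NiceSplit`, `longestNice` and `isUnpaired` are hypothetical.

```java
import java.util.HashSet;
import java.util.Set;

final class NiceSplit {
    static String longestNice(String s) {
        if (s.length() < 2) return "";
        Set<Character> seen = new HashSet<>();
        for (char c : s.toCharArray()) seen.add(c);

        String best = "";
        boolean split = false;   // becomes true once an unpaired character is found
        int segmentStart = 0;
        for (int i = 0; i <= s.length(); i++) {
            boolean atEnd = (i == s.length());
            if (atEnd || isUnpaired(s.charAt(i), seen)) {
                if (!atEnd) split = true;
                if (split) {
                    // Conquer the segment [segmentStart, i), which is strictly shorter than s.
                    String candidate = longestNice(s.substring(segmentStart, i));
                    // Strict > keeps the earliest segment's result on ties.
                    if (candidate.length() > best.length()) best = candidate;
                }
                segmentStart = i + 1;
            }
        }
        // No unpaired character means s itself is nice.
        return split ? best : s;
    }

    // True if c occurs in s in only one case.
    static boolean isUnpaired(char c, Set<Character> seen) {
        return !(seen.contains(Character.toUpperCase(c))
                && seen.contains(Character.toLowerCase(c)));
    }
}
```

For "YazaAay", the only unpaired character at the top level is 'z', which leaves the segments "Ya" and "aAay"; the second one is cut again at 'y', leaving "aAa".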

Edit: BTW, I don't think it is a good example of a D&C problem. The point is, once we encounter the first "bad" character, the substring to the left of it is nice. There is no need to descend into it. Just record its length and keep going. A simple loop it is.

user58697
  • The substring to the left of the first bad character is not always nice. For example: ApzPAa. Here `z` is the first bad character. However, the substring to the left of it is not nice. – AKSingh Mar 03 '21 at 20:14
  • @AKSingh Thanks for catching. I don't know what I was thinking. – user58697 Mar 03 '21 at 20:57
  • No worries. It can happen to the best of us. – AKSingh Mar 03 '21 at 21:03
  • Add a bit of pruning to save some effort: If you have a nice substring, you don't need to check any not-yet-checked substrings that are shorter than that. – Rick James Apr 01 '21 at 00:54
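
One way to read the pruning suggestion above, sketched against the recursive solution from the question: once the left part has been solved, the right part only needs solving if it is long enough to beat that result. This is an interpretation, not code from any of the commenters, and the class name `NicePruned` is hypothetical.

```java
import java.util.HashSet;
import java.util.Set;

final class NicePruned {
    static String longestNice(String s) {
        if (s.length() < 2) return "";
        Set<Character> seen = new HashSet<>();
        for (char c : s.toCharArray()) seen.add(c);
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (seen.contains(Character.toUpperCase(c))
                    && seen.contains(Character.toLowerCase(c))) continue;
            // The left part still has to be solved recursively: it is not
            // guaranteed to be nice (in "ApzPAa" it is "Ap").
            String left = longestNice(s.substring(0, i));
            String rest = s.substring(i + 1);
            // Pruning: nothing inside rest can be longer than rest itself,
            // and ties go to the earlier substring, so skip the recursion.
            if (rest.length() <= left.length()) return left;
            String right = longestNice(rest);
            return left.length() >= right.length() ? left : right;
        }
        return s;
    }
}
```

For "aAbz", the split at 'b' yields "aA" on the left, and the remaining "z" is too short to beat it, so the right side is never explored.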

Divide-and-conquer, to paraphrase Wikipedia, is most appropriate when a problem can be broken down into "2 or more subproblems". The solution here checks whether the input string meets the condition; if it does not, it splits the string in two at a character that appears in only one case and recursively checks each part, until a part is either nice or too short to be nice. Generally, the application of divide-and-conquer is easy to get a feel for when the problem can be subdivided symmetrically, such as in the DeWall algorithm for computing the Delaunay triangulation of a set of points (http://vcg.isti.cnr.it/publications/papers/dewall.pdf - cool stuff).

What sets the substring problem apart in this instance is that it checks all (edit:) viable subdivisions by advancing the split point one character at a time. To clarify for anyone who might be confused, this is necessary because the string can't simply be split down the middle; otherwise you might split a nice substring like "aAaA" apart and return only half of it in the end. This loosely meets the "or more" part of "2 or more subproblems", but I agree it's not intuitive in this instance.
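
To make the midpoint concern concrete, here is a deliberately flawed sketch that always splits in the middle. The class `MidpointSplit` and its methods are hypothetical and exist only to show the failure; this is not a solution to the problem.

```java
import java.util.HashSet;
import java.util.Set;

final class MidpointSplit {
    // Deliberately WRONG: splits at the midpoint instead of at an unpaired character.
    static String longestNiceMidpoint(String s) {
        if (s.length() < 2) return "";
        if (isNice(s)) return s;
        int mid = s.length() / 2;
        String left = longestNiceMidpoint(s.substring(0, mid));
        String right = longestNiceMidpoint(s.substring(mid));
        return left.length() >= right.length() ? left : right;
    }

    // A string is nice if every letter it contains appears in both cases.
    static boolean isNice(String s) {
        Set<Character> seen = new HashSet<>();
        for (char c : s.toCharArray()) seen.add(c);
        for (char c : seen) {
            if (!seen.contains(Character.toUpperCase(c))
                    || !seen.contains(Character.toLowerCase(c))) {
                return false;
            }
        }
        return true;
    }
}
```

For "zaAaAz" the longest nice substring is "aAaA", but the midpoint cut falls between "zaA" and "aAz", so this version can only return "aA".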

Hope this helps, I had to learn about this a lot recently while implementing the referenced algorithm. Someone with more experience might have a better answer.

Optimum
  • It doesn't check __all__ subdivisions. It only checks viable ones. – user58697 Mar 03 '21 at 17:44
  • I was referring to this line: `for (int i = 0; i < arr.length; i++)` - which, sure, checks all subdividing *points* if you want to be picky. – Optimum Mar 03 '21 at 19:52
  • ... but it only recurses if that point exists in only a single case. Notice a `continue` in that very long `set.contains(Character.toUpperCase(c)) && set.contains(Character.toLowerCase(c))` test. – user58697 Mar 03 '21 at 19:54
  • Whoops, didn't see that (also never touched Java), but after some googling I think I see what you're getting at - I'll edit it – Optimum Mar 04 '21 at 15:40