Stuck on how to make this palindrome pairs finding function be less than O(N^2)

Question

Task here is to find palindromes formed by combining pairs of the original word list (so for below, stuff like "callac" and "laccal"). I thought I had an idea of pre-computing all the reversed versions of each word (N). Then for each original word, compare with each reversed word... but then we're back to N*N.

Maybe we could sort the list. And then for each word we're working on, do a binary search for words whose first character matches our word's first or last character and check from there resulting in some kind of N log N situation?

My code:

data = ["cal", "lac", "mic", "blah", "totally", "ylla", "rum", "mur", "listen", "netsil"]

def get_palindrome_pairs(words):
    palindromes = []
    for i in range(len(words)):
        for j in range(i+1, len(words)):
            combined = words[i] + words[j]
            combined_swapped = words[j] + words[i]
            if combined == combined[::-1]:
                palindromes.append(combined)
            if combined_swapped == combined_swapped[::-1]:
                palindromes.append(combined_swapped)

    return palindromes

print(get_palindrome_pairs(data))

What are the constraints on `N` and the length of words? — Abhinav Mathur, Mar 10 '21 at 04:47
I don't have any specifically, but let's assume large values of both such that optimizing is worthwhile. — Aaron, Mar 10 '21 at 17:50

Mark Saving · Answer 1 · 2021-03-10T04:51:27.637

Edit: my original algorithm did not work. The situation is, in fact, more complicated than this because you might have a ["abc", "ba"] situation where the combination gives you "abcba", which is a palindrome. You need to use a more sophisticated approach to account for this.

First, note that one can use a modified KMP algorithm to find, for any strings s1 and s2, all prefixes of (or, to be more precise, the lengths of all prefixes of) s1 which are suffixes of s2. This can be done in linear time. The basic version of KMP can be easily modified to find the longest prefix of s1 which is a suffix of s2; one can then use the backtracking table to find all the others.
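A rough sketch of that first step. It uses the common trick of running the KMP failure table over `s1 + sep + s2` rather than modifying the matcher itself. The function names and the `sep` parameter are made up for illustration, and the code assumes `sep` never occurs in either string.

```python
def prefix_function(t):
    """KMP failure table: pi[i] is the length of the longest proper
    prefix of t[:i+1] that is also a suffix of t[:i+1]."""
    pi = [0] * len(t)
    k = 0
    for i in range(1, len(t)):
        while k and t[i] != t[k]:
            k = pi[k - 1]          # fall back to the next shorter border
        if t[i] == t[k]:
            k += 1
        pi[i] = k
    return pi


def prefix_suffix_lengths(s1, s2, sep="\x00"):
    """Lengths of all non-empty prefixes of s1 that are suffixes of s2.

    Assumption: sep does not occur in s1 or s2, so no border of the
    joined string can cross the separator.
    """
    if not s1 or not s2:
        return []
    pi = prefix_function(s1 + sep + s2)
    lengths = []
    k = pi[-1]                     # longest prefix of s1 that ends s2
    while k:
        lengths.append(k)
        k = pi[k - 1]              # backtrack to find the shorter ones
    return lengths[::-1]


print(prefix_suffix_lengths("abab", "xxabab"))
```

For `"abab"` and `"xxabab"` the prefixes `"ab"` and `"abab"` both end the second string, so the lengths are 2 and 4.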

Second, note that we can use this to find all palindromic prefixes of a string s in linear time. This is because a palindromic prefix of s is precisely a prefix of s which is a suffix of reversed(s).
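As a sketch of the second step, the same failure-table idea can be applied to `s` joined with its own reverse. `palindromic_prefix_lengths` is a hypothetical name, and the separator is again assumed not to appear in `s`.

```python
def palindromic_prefix_lengths(s, sep="\x00"):
    """Lengths of all non-empty palindromic prefixes of s, ascending.

    A prefix of s is a palindrome exactly when it is also a suffix of
    reversed(s), so we read the borders of s + sep + reversed(s).
    Assumption: sep does not occur in s.
    """
    if not s:
        return []
    t = s + sep + s[::-1]
    pi = [0] * len(t)              # KMP failure table of t
    k = 0
    for i in range(1, len(t)):
        while k and t[i] != t[k]:
            k = pi[k - 1]
        if t[i] == t[k]:
            k += 1
        pi[i] = k
    lengths = []
    k = pi[-1]
    while k:                       # walk the border chain, longest first
        lengths.append(k)
        k = pi[k - 1]
    return lengths[::-1]


print(palindromic_prefix_lengths("totally"))
```

For `"totally"` the palindromic prefixes are `"t"` and `"tot"`, which is the piece that makes `"ylla" + "totally"` work.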

Now suppose we have strings s and k, and length(s) >= length(k). Then ks is a palindrome iff we can write s = z reversed(k), where z is a palindrome.
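To make that characterization concrete, here is a small check written straight from the statement. The function name is invented for this example, and it only covers the case length(s) >= length(k).

```python
def ks_is_palindrome(k, s):
    """True if k + s is a palindrome, using s = z + reversed(k) with z
    a palindrome. Only valid when len(s) >= len(k)."""
    if len(s) < len(k):
        raise ValueError("this check assumes len(s) >= len(k)")
    cut = len(s) - len(k)
    z, tail = s[:cut], s[cut:]
    return tail == k[::-1] and z == z[::-1]


# "totally" = "tot" + "ally", "ally" is reversed("ylla"), "tot" is a palindrome
print(ks_is_palindrome("ylla", "totally"))
```

Here z is `"tot"` and the tail `"ally"` is `"ylla"` reversed, so `"yllatotally"` is a palindrome.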

To determine whether we have any s, k such that length(s) >= length(k) and ks is a palindrome, store all strings in a trie. Then, for each string s, find all its palindromic prefixes. Walk through s backwards to find all k in the trie such that reversed(k) is a suffix of s. If there is any such k, check to see if the corresponding prefix is a palindrome. If so, we have found k and s such that ks is a palindrome.

To determine whether we have any s, k such that length(s) >= length(k) and sk is a palindrome, it suffices to repeat the above step, but first reversing all the strings.
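Putting the two passes together, a minimal sketch might look like the following. It is one reading of the steps above, not a tuned implementation: the trie is a plain nested dict, all names are hypothetical, `"\x00"` is assumed not to occur in any word, and it returns index pairs `(i, j)` such that `words[i] + words[j]` is a palindrome. Equal-length pairs are reported only in the first pass so they are not found twice.

```python
END = ""  # trie key marking "a word ends here"; cannot clash with a character


def palindromic_prefix_lengths(s):
    """Set of p in 0..len(s) such that s[:p] is a palindrome (via KMP)."""
    t = s + "\x00" + s[::-1]       # assumes "\x00" is not in s
    pi = [0] * len(t)
    k = 0
    for i in range(1, len(t)):
        while k and t[i] != t[k]:
            k = pi[k - 1]
        if t[i] == t[k]:
            k += 1
        pi[i] = k
    lengths = {0}                  # the empty prefix counts as a palindrome
    k = pi[-1]
    while k:
        lengths.add(k)
        k = pi[k - 1]
    return lengths


def build_trie(words):
    root = {}
    for idx, w in enumerate(words):
        node = root
        for ch in w:
            node = node.setdefault(ch, {})
        node.setdefault(END, []).append(idx)   # list handles duplicate words
    return root


def short_then_long(words, strict):
    """Yield (j, i) where words[j] + words[i] is a palindrome and
    len(words[i]) >= len(words[j]) (strictly greater if strict)."""
    root = build_trie(words)
    for i, s in enumerate(words):
        pal = palindromic_prefix_lengths(s)
        n = len(s)
        node = root
        for d in range(n + 1):
            # node spells reversed(s[n-d:]); words ending here have length d
            if (d < n or not strict) and (n - d) in pal:
                for j in node.get(END, ()):
                    if j != i:
                        yield j, i
            if d == n or s[n - 1 - d] not in node:
                break
            node = node[s[n - 1 - d]]          # walk s backwards


def palindrome_pairs(words):
    pairs = set(short_then_long(words, strict=False))
    rev = [w[::-1] for w in words]
    # rev[j] + rev[i] is a palindrome  <=>  words[i] + words[j] is one
    pairs.update((i, j) for j, i in short_then_long(rev, strict=True))
    return sorted(pairs)


data = ["cal", "lac", "mic", "blah", "totally", "ylla", "rum", "mur", "listen", "netsil"]
print([data[i] + data[j] for i, j in palindrome_pairs(data)])
```

The work outside the inner `for j in ...` loop is proportional to the total number of characters. Listing every pair can still take longer than that when there are many pairs to report.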

The total runtime will be O(C + n), where C is the total number of characters across all strings and n is the number of strings. (C is used here to avoid clashing with the string k above.)

That's an astute pickup. Also longer possibilities like `["ylla","totally"]` - looks like you'll need to consider each word's set of prefix-palindromes and suffix-palindromes (more specifically - the non-palindromic other part of the word), as you can make a match on pairs like `[ {palindrome}{rest}, {rest_reversed} ]` and also `[ {rest}{palindrome}, {rest_reversed} ]`. Maybe some sort of preprocessing that finds all of the palindrome-complement prefixes/suffixes and puts those in the trie too, since that's what can be matched? — moreON, Mar 10 '21 at 03:41
@moreOn That's what I came up with on paper. I think I've found a nice way to use KMP to find all palindromic prefixes of a string in linear time. — Mark Saving, Mar 10 '21 at 04:41
