Longest common substring for 2/3 strings: suffix array vs. dynamic programming approach

Question

If I want to find the longest common substring of 2 strings, which approach is more efficient in terms of time/space complexity: suffix arrays or DP?

DP will incur O(m*n) space with O(m*n) time complexity. What will be the time complexity of the suffix array approach?

1) Calculate the suffixes: O(m) + O(n)
2) Sort them: O((m+n) log2(m+n)) comparisons
3) Find the longest common prefix for m+n-1 pairs of adjacent strings? [I'm not sure how to calculate the number of comparisons]

Suffix arrays allow us to do many more things with substrings (such as substring search), but since the rest of those functions are not needed in this case, would DP be considered the easier/cleaner approach? Which one should be used when we are comparing 2 strings?

Also, what if we have more than 2 strings?
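
For reference, the DP approach mentioned above can be sketched roughly as follows. This is only an illustration, not code from the original post: the function name longest_common_substring_dp is made up, and it keeps just two rows of the table, so the extra space is O(min(m, n)) rather than the full O(m*n) table, while the time stays O(m*n).

```python
def longest_common_substring_dp(a: str, b: str) -> str:
    """Return one longest common substring of a and b ("" if there is none)."""
    # Put the shorter string along the row so each row has min(m, n) + 1 cells.
    if len(b) > len(a):
        a, b = b, a

    # prev[j] = length of the longest common suffix of a[:i-1] and b[:j]
    prev = [0] * (len(b) + 1)
    best_len = 0
    best_end = 0  # exclusive end index of the best match inside a

    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # Extend the match that ended at the previous characters.
                curr[j] = prev[j - 1] + 1
                if curr[j] > best_len:
                    best_len = curr[j]
                    best_end = i
        prev = curr

    return a[best_end - best_len:best_end]


if __name__ == "__main__":
    print(longest_common_substring_dp("xabcdy", "zabcdw"))
```

The table only ever looks one row back, which is why the full m*n matrix is not needed when only the substring itself is wanted.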

score 0 · Answer 1 · answered Jun 17 '13 at 14:56

A suffix array would be better. The LCS (longest common substring of n strings) problem can be solved as follows:

1) Concatenate S1, S2, ..., Sn as follows: S = S1 $1 S2 $2 ... Sn $n. Here the $i are special symbols (sentinels) that are all distinct and lexicographically smaller than every symbol of the original alphabet.
2) Compute the suffix array of S. A suffix array is typically built in O(n log n) time, but there is an important algorithm called DC3 (also known as the skew algorithm) that builds it in O(n), where n is the total length of all the strings. You can google this algorithm.
3) Compute the LCP (longest common prefix) of all adjacent suffixes in the suffix array.
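
The steps above stop at the LCP array, so here is one possible way to finish the job, sketched under stated assumptions rather than taken from the answer. The function name longest_common_substring_sa is hypothetical; the suffix array is built with a naive sort (not DC3) to keep the sketch short; the LCP array uses Kasai's algorithm; and the last step, which the answer does not spell out, is assumed to be a sliding window over the suffix array: for every window that contains at least one suffix from each input string, the minimum LCP inside the window is the length of a substring shared by all of them.

```python
from collections import deque


def longest_common_substring_sa(strings: list[str]) -> str:
    """Return one longest substring common to all strings ("" if none)."""
    k = len(strings)
    if k == 0 or any(not s for s in strings):
        return ""
    if k == 1:
        return strings[0]

    # Step 1: concatenate with unique sentinels. Sentinels are the integers
    # 0..k-1; real characters are shifted by k so every sentinel sorts first.
    text, owner = [], []
    for idx, s in enumerate(strings):
        text.extend(ord(c) + k for c in s)
        owner.extend([idx] * len(s))
        text.append(idx)
        owner.append(-1)  # a sentinel position belongs to no string
    n = len(text)

    # Step 2: suffix array by naive sorting (assumption: fine for a sketch;
    # swap in DC3 or prefix doubling for large inputs).
    sa = sorted(range(n), key=lambda i: text[i:])

    # Step 3: Kasai's algorithm; lcp[r] = LCP of suffixes sa[r-1] and sa[r].
    rank = [0] * n
    for r, p in enumerate(sa):
        rank[p] = r
    lcp = [0] * n
    h = 0
    for p in range(n):
        r = rank[p]
        if r == 0:
            h = 0
            continue
        q = sa[r - 1]
        while p + h < n and q + h < n and text[p + h] == text[q + h]:
            h += 1
        lcp[r] = h
        if h > 0:
            h -= 1

    # Step 4 (assumed): slide a window [lo, hi] over suffix-array ranks and
    # track min(lcp[lo+1..hi]) with a monotonic deque of ranks.
    count = [0] * k
    covered = 0
    best_len, best_pos = 0, 0
    window_min = deque()
    lo = 0
    for hi in range(n):
        while window_min and lcp[window_min[-1]] >= lcp[hi]:
            window_min.pop()
        window_min.append(hi)
        o = owner[sa[hi]]
        if o >= 0:
            count[o] += 1
            covered += count[o] == 1
        while covered == k:
            # Only lcp values at ranks lo+1..hi describe this window.
            while window_min[0] <= lo:
                window_min.popleft()
            m = lcp[window_min[0]]
            if m > best_len:
                best_len, best_pos = m, sa[hi]
            o = owner[sa[lo]]
            if o >= 0:
                count[o] -= 1
                covered -= count[o] == 0
            lo += 1

    return "".join(chr(c - k) for c in text[best_pos:best_pos + best_len])


if __name__ == "__main__":
    print(longest_common_substring_sa(["banana", "ananas", "bandana"]))
```

Because every sentinel occurs exactly once, no common prefix can run across a string boundary. With two strings, the window step reduces to taking the largest LCP between adjacent suffixes that come from different strings.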

Can you please explain a little more? I could do this for two strings with a DC3 suffix array and an LCP array: there I look for the largest LCP value such that the suffixes at positions i and i-1 belong to different strings. For 10 such strings, do I need to check 10 consecutive values in the LCP array, each of which should point to a different string? - Naman, Jun 13 '15 at 23:42
