
Given a string S consisting of lowercase Latin letters, I want to find, for each position i, the maximum length L[i] for which there exists a position i' < i such that S[i'..i'+L[i]-1] = S[i..i+L[i]-1] (the two occurrences may overlap, as in the example). For example, for S = ababaab, L = {0,0,3,2,1,2,1}. I want to do this in better than O(|S|^2) time. I guess the problem can be solved with a suffix array, but how?

rodart
  • Couple of questions: which programming language are you using? Have you tried something yet? Do you have any idea on how you could possibly solve the problem? – sergico May 13 '12 at 14:20
  • The programming language is not important; say C/C++, for example. I am interested in the algorithm. My naive idea is to iterate over every position of the string and, each time, search the suffix array for the longest common prefix starting at the current position. – rodart May 13 '12 at 14:28

2 Answers


You should look at the Z algorithm (the "ZBlock" or Z-function algorithm). It solves a slightly different problem, where i' is always equal to 0, but it runs in O(|S|), and you should be able to adapt it to your case.
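
For reference, a minimal Python sketch of the Z-function is given here. The name `z_function` and the convention `z[0] = 0` are choices made for this illustration. It only answers the restricted question (longest match against the prefix starting at position 0), not the full problem from the question.

```python
def z_function(s):
    """Return z, where z[i] is the length of the longest common prefix
    of s and s[i:]. By convention in this sketch, z[0] is left as 0."""
    n = len(s)
    z = [0] * n
    left = right = 0  # s[left:right] is the rightmost window known to match a prefix of s
    for i in range(1, n):
        if i < right:
            # Reuse the value computed for the mirrored position inside the window.
            z[i] = min(right - i, z[i - left])
        # Extend the match character by character.
        while i + z[i] < n and s[i + z[i]] == s[z[i]]:
            z[i] += 1
        if i + z[i] > right:
            left, right = i, i + z[i]
    return z


print(z_function("ababaab"))
```

For S = ababaab, the entry at index 2 agrees with L[2] = 3 from the question because the best earlier match for that position happens to start at 0. Other entries can be smaller than L[i], for example at index 3, where the best earlier match starts at position 1.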

Dynamic programming would solve this in O(|S|^2) using a modified version of substring matching, but I guess that is not the kind of solution you are looking for.
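
One way such a quadratic dynamic program could look is sketched below, as a baseline to check faster solutions against. It assumes the recurrence lcp(i, j) = lcp(i+1, j+1) + 1 when S[i] = S[j], and 0 otherwise. The function name is hypothetical.

```python
def longest_previous_match_quadratic(s):
    """O(n^2) time, O(n) memory baseline.
    L[j] = max over i < j of the longest common prefix of s[i:] and s[j:]."""
    n = len(s)
    L = [0] * n
    # below[j] holds lcp(s[i+1:], s[j:]) for the row processed just before row i.
    below = [0] * (n + 1)
    for i in range(n - 1, -1, -1):
        cur = [0] * (n + 1)
        for j in range(n - 1, i, -1):
            if s[i] == s[j]:
                cur[j] = below[j + 1] + 1  # extend the match that starts one step later
                if cur[j] > L[j]:
                    L[j] = cur[j]
        below = cur
    return L


print(longest_previous_match_quadratic("ababaab"))  # [0, 0, 3, 2, 1, 2, 1]
```

The printed result matches the example from the question.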

Samy Arous

What you are looking for is called the "longest previous factor" (LPF), and there is indeed a paper by Crochemore and Ilie with two suffix-array algorithms for computing it. The good news is that both run in linear time. The second algorithm uses the LCP table and looks a bit easier to me.
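
To make the idea concrete, here is a hedged Python sketch that computes the LPF table from a suffix array and an LCP table. It is not a transcription of either algorithm from the paper. It uses a simple linked-list formulation: positions are removed in decreasing order, so the neighbours of a suffix in the remaining sorted list are the closest suffixes that start earlier. The suffix array is built by naive sorting for brevity, which is not linear. Everything after that step is O(|S|), so a faster suffix array construction can be substituted. All function names are chosen for this illustration.

```python
def suffix_array(s):
    # Naive construction, for illustration only (not linear time).
    return sorted(range(len(s)), key=lambda i: s[i:])


def lcp_array(s, sa):
    """Kasai's algorithm: lcp[r] is the longest common prefix of the
    suffixes at ranks r-1 and r; lcp[0] is 0."""
    n = len(s)
    rank = [0] * n
    for r, i in enumerate(sa):
        rank[i] = r
    lcp = [0] * n
    h = 0
    for i in range(n):
        if rank[i] > 0:
            j = sa[rank[i] - 1]
            while i + h < n and j + h < n and s[i + h] == s[j + h]:
                h += 1
            lcp[rank[i]] = h
            if h > 0:
                h -= 1
        else:
            h = 0
    return lcp


def longest_previous_factor(s):
    n = len(s)
    sa = suffix_array(s)
    rank = [0] * n
    for r, i in enumerate(sa):
        rank[i] = r
    # with_prev[r]: lcp between the suffix at rank r and its current
    # predecessor in the linked list (initially its neighbour in sa).
    with_prev = lcp_array(s, sa)
    prev = [r - 1 for r in range(n)]  # -1 means "no predecessor"
    nxt = [r + 1 for r in range(n)]   # n means "no successor"
    lpf = [0] * n
    for i in range(n - 1, -1, -1):
        r = rank[i]
        p, q = prev[r], nxt[r]
        lcp_prev = with_prev[r] if p >= 0 else 0
        lcp_next = with_prev[q] if q < n else 0
        lpf[i] = max(lcp_prev, lcp_next)
        # Unlink rank r; its successor now borders its predecessor.
        if p >= 0:
            nxt[p] = q
        if q < n:
            prev[q] = p
            with_prev[q] = min(with_prev[q], with_prev[r])
    return lpf


print(longest_previous_factor("ababaab"))  # [0, 0, 3, 2, 1, 2, 1]
```

The output reproduces the table L from the question. The linked-list step is correct because, among the suffixes that start before i, the one sharing the longest prefix with suffix i is always one of its two nearest neighbours in sorted order.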

Dale Gerdemann