KMP algorithm for string matching.
The following is code I found online for computing the longest prefix-suffix (LPS) array:
Definition:
lps[i] = the length of the longest proper prefix of pat[0..i]
that is also a suffix of pat[0..i].
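To make the definition concrete, here is a minimal brute-force sketch that computes the same table directly from the definition. The function name lps_bruteforce is my own (it is not part of the code I found online), and it is far too slow for real use; it is only meant as a reference to check other versions against.

```c
#include <string.h>

/* Hypothetical reference helper: computes lps[] straight from the definition.
   For each i, try every proper length k (k <= i), longest first, and keep the
   first k for which the prefix pat[0..k-1] equals the suffix pat[i-k+1..i].
   Roughly O(M^3); intended for checking results only. */
void lps_bruteforce(const char *pat, int M, int *lps)
{
    for (int i = 0; i < M; i++) {
        lps[i] = 0;                       /* default: no matching prefix/suffix */
        for (int k = i; k > 0; k--) {     /* k <= i keeps the prefix proper */
            if (memcmp(pat, pat + i - k + 1, (size_t)k) == 0) {
                lps[i] = k;
                break;                    /* longest match found */
            }
        }
    }
}
```

Calling it with a pattern, its length, and an int array of at least that length fills the array with the values the definition asks for.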
Code:
void computeLPSArray(char *pat, int M, int *lps)
{
    int len = 0; // length of the previous longest prefix suffix
    int i;
    lps[0] = 0; // lps[0] is always 0
    i = 1;
    // the loop calculates lps[i] for i = 1 to M-1
    while(i < M)
    {
        if(pat[i] == pat[len])
        {
            len++;
            lps[i] = len;
            i++;
        }
        else // (pat[i] != pat[len])
        {
            if( len != 0 )
            {
                // This is tricky. Consider the example AAACAAAA and i = 7.
                len = lps[len-1]; //*****************
                // Also, note that we do not increment i here
            }
            else // if (len == 0)
            {
                lps[i] = 0;
                i++;
            }
        }
    }
}
Can I use len = len-1 instead of len = lps[len-1]?
len always counts the length of a prefix, i.e. pat[0 .. someIndex], so why is lps used for the assignment here? Below are the cases I tested, and both versions give the same result (in each group, the first line is the pattern and the next two lines are the results for the original and the modified assignment to len):
a a a b a b c
0 1 2 0 1 0 0
0 1 2 0 1 0 0
a b c b a b c
0 0 0 0 1 2 3
0 0 0 0 1 2 3
a a b c b a b
0 1 0 0 0 1 0
0 1 0 0 0 1 0
Code with both variations: http://ideone.com/qiSrUo
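For anyone who wants to try more patterns than the three above, here is a small self-contained sketch of the comparison. The names compute_lps_variant and variants_agree and the 64-character limit are my own assumptions for illustration; the loop is the same as in the code above, with the fallback line switched by a flag.

```c
#include <string.h>

/* Same loop as computeLPSArray, with the fallback selectable:
   use_decrement == 0 -> len = lps[len-1]  (original)
   use_decrement != 0 -> len = len - 1     (proposed change) */
void compute_lps_variant(const char *pat, int M, int *lps, int use_decrement)
{
    int len = 0, i = 1;
    if (M <= 0) return;
    lps[0] = 0;
    while (i < M) {
        if (pat[i] == pat[len]) {
            lps[i] = ++len;
            i++;
        } else if (len != 0) {
            len = use_decrement ? len - 1 : lps[len - 1]; /* i is not advanced */
        } else {
            lps[i] = 0;
            i++;
        }
    }
}

/* Hypothetical helper: returns 1 if both fallbacks produce the same table
   for pat, 0 if they differ, and -1 if pat exceeds the assumed 64-char limit. */
int variants_agree(const char *pat)
{
    int M = (int)strlen(pat);
    int a[64], b[64];
    if (M > 64) return -1;
    compute_lps_variant(pat, M, a, 0);
    compute_lps_variant(pat, M, b, 1);
    return memcmp(a, b, (size_t)M * sizeof(int)) == 0;
}
```

Passing each pattern without the spaces (for example "aaababc") to variants_agree repeats the checks listed above. It may also be worth trying patterns where a repeated block is followed by a mismatch, such as "ababb", since three cases are a small sample.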