I'm reading about the KMP algorithm on Wikipedia. One line of code in the "Description of pseudocode for the table-building algorithm" section confuses me: let cnd ← T[cnd]

It has the comment "(second case: it doesn't, but we can fall back)". I understand that we can fall back, but why to T[cnd] specifically? Is there a reason for that value?

Here is the complete pseudocode for the table-building algorithm:

algorithm kmp_table:
    input:
        an array of characters, W (the word to be analyzed)
        an array of integers, T (the table to be filled)
    output:
        nothing (but during operation, it populates the table)

    define variables:
        an integer, pos ← 2 (the current position we are computing in T)
        an integer, cnd ← 0 (the zero-based index in W of the next character of the current candidate substring)

    (the first few values are fixed but different from what the algorithm might suggest)
    let T[0] ← -1, T[1] ← 0

    while pos < length(W) do
        (first case: the substring continues)
        if W[pos - 1] = W[cnd] then
            let cnd ← cnd + 1, T[pos] ← cnd, pos ← pos + 1

        (second case: it doesn't, but we can fall back)
        else if cnd > 0 then
            let cnd ← T[cnd]

        (third case: we have run out of candidates.  Note cnd = 0)
        else
            let T[pos] ← 0, pos ← pos + 1
zyy7259

1 Answer


You can fall back to T[cnd] because it holds the length of the longest proper prefix of W that is also a proper suffix of the first cnd characters of W, which is the part of the candidate matched so far. That makes T[cnd] the next-longest candidate that still ends just before W[pos-1], so no possible match is skipped. If the current character W[pos-1] then matches W[T[cnd]], you can extend that shorter prefix by one character, which is the first case again.

I guess it's kind of like dynamic programming where you rely on previously computed values.
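
To make the fallback concrete, here is a rough Python sketch of the pseudocode from the question. It is only my transcription, not an official implementation; the `fallbacks` list and its tuple layout are my own additions for illustration, recording every time the second case fires.

```python
def kmp_table(W):
    """Build T as in the quoted pseudocode.

    Also returns a list of (pos, old_cnd, new_cnd) tuples, one for each
    time the second case (cnd <- T[cnd]) is taken. That list is a
    hypothetical debugging aid, not part of the original algorithm.
    """
    n = len(W)
    T = [0] * n
    fallbacks = []
    if n > 0:
        T[0] = -1  # fixed first value
    # T[1] is fixed at 0, which the initialisation above already covers.

    pos, cnd = 2, 0
    while pos < n:
        if W[pos - 1] == W[cnd]:
            # First case: the candidate prefix continues.
            cnd += 1
            T[pos] = cnd
            pos += 1
        elif cnd > 0:
            # Second case: retry with the next-longest prefix that is
            # also a suffix of the part matched so far.
            fallbacks.append((pos, cnd, T[cnd]))
            cnd = T[cnd]
        else:
            # Third case: no candidates left.
            T[pos] = 0
            pos += 1
    return T, fallbacks


if __name__ == "__main__":
    table, steps = kmp_table("ABACABABC")
    print(table)
    print(steps)
```

For "ABACABABC" the interesting step is at pos = 8: the candidate "ABA" (cnd = 3) cannot be extended because W[7] = 'B' differs from W[3] = 'C', so cnd drops to T[3] = 1. Now W[7] matches W[1], and the shorter prefix "A" grows to "AB", giving T[8] = 2.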

This explanation might help you.

kintoki