Suppose you have a string Hello World!!!
and you want to search for Head Up
.
Hello World!!!
Head Up
^
When you are in the first and second character the first condition apply (first case: the substring continues)
, in the case of the marked position, the character don't match but you are already inside the a sub-string match (2 character matched until there), this case correspond to the second conditional (second case: it doesn't, but we can fall back)
. The third case would be when you miss match the first character of the pattern.
The second condition is necessary because you could use the information of the matched character until the miss match, to avoid unnecessary comparison that you could already known the result (skip characters of the string
that you already know that the beginning part of the pattern would don't match).
Example: With string HeHello World!!!
and searching for Hello
HeHello World!!!
Hello
^ when you miss match this character using the table of KMP you known that
could skip 2 characters because
HeHello World!!!
Hello
^ this would miss match
In the case of building the pattern table for pattern HeHello
. Suppose ^
is cnd
and *
is pos
. The start point is pos = 2
and cnd = 0
(but when checking in the pattern is with pos - 1 = 1
).
HeHeHello T [-1,0,0,0,0,0,0,0,0]
^* comparing 0 with 1 go to condition 3 cnd = 0, pos = 2
_
HeHeHello T [-1,0,0,1,0,0,0,0,0]
^ * comparing 0 with 2 go to condition 1 cnd = 0, pos = 3
_
HeHeHello T [-1,0,0,1,2,0,0,0,0]
^ * comparing 1 with 3 go to condition 1 cnd = 1, pos = 4
_
HeHeHello T [-1,0,0,1,2,3,0,0,0]
^ * comparing 2 with 4 go to condition 1 cnd = 2, pos = 5
_
HeHeHello T [-1,0,0,1,2,3,4,0,0]
^ * comparing 3 with 5 go to condition 1 cnd = 3, pos = 6
HeHeHello T [-1,0,0,1,2,3,4,0,0]
^ * comparing 4 with 6 go to condition 2 (cnd = T[cnd], cnd = T[4] = 2)
HeHeHello T [-1,0,0,1,2,3,4,0,0]
^ * comparing 2 with 6 go to condition 2 (cnd = T[cnd], cnd = T[2] = 0)
...