
Given a suffix tree of a string s, whose number of vertices is bounded by O(|s|), how can I implement the following operation:

Is-k-Sub-string(r): checks whether the string r is a k-sub-string of s, where a k-sub-string is defined as follows:

A string r is a k-sub-string of s if there is a partition of r into k sub-strings of s, that is:

r = x1 x2 ... xk, where each xi is a sub-string of s.

Example: s = whitething, r = within. Here r is a 3-sub-string of s (r = w + it + hin).

I need this operation to run in O(|r|) time.

I don't understand how to do that in O(|r|), because any position in r can be a boundary between parts. For example, for a 2-sub-string it seems I must try every possible position as the boundary between x1 and x2 (for the partition r = x1 x2).

Any ideas?

Henry
Avenger

1 Answer

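Lemma: If A is a suffix of B and B can be split into at most k substrings of S, then so can A.

Proof: let B = x[1] x[2] x[3] ... x[k]. Throw away the first |B| - |A| characters of the partition. Some leading parts disappear completely, and at most one part is cut down to one of its suffixes, which is still a substring of S. We get a partition of A with no more than k parts.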

Corollary: if we have a fixed partition of some prefix of R, there's an optimal partition in which the next substring is the longest one we can take.
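The corollary can be sanity-checked on small inputs by comparing the greedy choice with a brute-force dynamic program. The sketch below is only an illustration: the function names are made up here, and it uses Python's built-in `in` substring test rather than a suffix tree, so it is far from linear time.

```python
def min_parts_dp(s, r):
    """Brute force: fewest substrings of s that concatenate to r (inf if impossible)."""
    inf = float("inf")
    best = [0] + [inf] * len(r)  # best[i] = fewest parts covering r[:i]
    for end in range(1, len(r) + 1):
        for start in range(end):
            if best[start] + 1 < best[end] and r[start:end] in s:
                best[end] = best[start] + 1
    return best[len(r)]


def min_parts_greedy(s, r):
    """Greedy: always take the longest prefix of the rest of r that occurs in s."""
    pos = 0
    parts = 0
    while pos < len(r):
        end = pos
        while end < len(r) and r[pos:end + 1] in s:
            end += 1
        if end == pos:
            return float("inf")  # r[pos] does not occur in s at all
        parts += 1
        pos = end
    return parts


print(min_parts_dp("whitething", "within"), min_parts_greedy("whitething", "within"))
```

On the example from the question both functions report 3 parts.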

The solution immediately follows from the statements proven above. We can make each part as long as we can:

pos = 0
parts = 0
while pos < R.length:
      take the longest prefix of R[pos:] that is a substring of S
      if this prefix is empty, stop: R[pos] does not occur in S
      move pos after the end of this substring
      parts = parts + 1

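This solution can be implemented in linear time: we can just start from the root of the tree and keep going as long as there is a transition for the current character of R. If there's none, we add the current part to the answer and restart from the root. Each character of R is consumed once, so the whole walk takes O(|R|) time, and R can be split into at most k substrings of S exactly when the number of parts found is at most k.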
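Here is a minimal sketch of that walk in Python. To keep it short it uses a naive, uncompressed suffix trie built from nested dicts (quadratic in |S|) as a stand-in for a real suffix tree; the query itself still touches each character of R once. The names `build_suffix_trie`, `min_parts` and `is_k_substring` are hypothetical, and "k-substring" is read as "at most k parts", as in the lemma.

```python
def build_suffix_trie(s):
    """Naive suffix trie as nested dicts. Illustration only: O(|s|^2) time and space."""
    root = {}
    for i in range(len(s)):
        node = root
        for ch in s[i:]:
            node = node.setdefault(ch, {})
    return root


def min_parts(root, r):
    """Greedy walk: number of parts used to cover r, or None if r cannot be covered."""
    parts = 0
    node = root
    for ch in r:
        if ch not in node:
            # No transition: the current part ends here, restart from the root.
            node = root
            if ch not in node:
                return None  # ch does not occur in s at all
        if node is root:
            parts += 1  # a new part starts with this character
        node = node[ch]
    return parts


def is_k_substring(root, r, k):
    """Assumption: r qualifies if it splits into at most k substrings of s."""
    parts = min_parts(root, r)
    return parts is not None and parts <= k


trie = build_suffix_trie("whitething")
print(min_parts(trie, "within"))          # w + it + hin
print(is_k_substring(trie, "within", 3))
print(is_k_substring(trie, "within", 2))
```

With a compressed suffix tree the same loop applies, except that a step may advance along an edge label instead of moving to a child node.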

kraskevich