
The Two Way algorithm is a substring search algorithm (primary paper, 1.4 MB PDF).

It splits the search pattern x into two parts, x = xl xr. It first tries to match xr against the text; if that succeeds, the algorithm prescribes matching xl in reverse (i.e. in right-to-left order).

  • Why is xl matched from right to left?
  • Can I replace this with a left-to-right comparison instead?

The reason for the question is simple: a comparison with unspecified order is already available and possibly faster; think of something like an optimized memcmp or an unrolled loop.
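To make the two variants concrete, here is a minimal sketch of the check at a single candidate position only. The function name `matches_at` and its `split`, `pos`, and `reverse_left` parameters are made up for illustration and are not taken from the paper; the slice comparison stands in for memcmp. It shows only that the match/no-match verdict at one alignment is the same either way; it says nothing about how the algorithm chooses its shifts.

```python
def matches_at(text, x, split, pos, reverse_left=True):
    """Check whether pattern x occurs in text at offset pos.

    split is the (hypothetical) index where x is cut into
    xl = x[:split] and xr = x[split:].
    """
    if pos < 0 or pos + len(x) > len(text):
        return False
    # Right part first, compared left to right.
    for i in range(split, len(x)):
        if text[pos + i] != x[i]:
            return False
    if reverse_left:
        # Left part compared right to left, as the algorithm prescribes.
        for i in range(split - 1, -1, -1):
            if text[pos + i] != x[i]:
                return False
        return True
    # Order-unspecified comparison of the left part (memcmp stand-in).
    return text[pos:pos + split] == x[:split]
```

For example, `matches_at("abracadabra", "cada", 2, 4)` and the same call with `reverse_left=False` both report a match.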

bluss
  • How would you match it from left to right when you're looking for a suffix? – biziclop Jul 31 '15 at 14:51
  • You're looking for the first occurrence of a substring *x* in a text, so the global search is from “left to right” (memory order), but that half of the pattern is matched starting with the rightmost byte. – bluss Jul 31 '15 at 15:41

1 Answer


From an efficiency point of view it obviously doesn't matter. The only other reason I can think of is this: when the match fails, a right-to-left attempt potentially leaves the algorithm with more information about a partial match. Going right to left, if we match 2 characters of xl and then fail, we know we have a contiguous partial match of 2 characters plus xr. If we match xl left to right and fail, we know nothing more than the xr match.
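A small sketch of that difference, assuming xr has already matched at the given position. The helper names `left_match_rtl` and `left_match_ltr` and the `split` index are invented for illustration. Whether the Two Way algorithm actually makes use of this extra information is not something the sketch shows.

```python
def left_match_rtl(text, x, split, pos):
    """Count xl characters matched going right to left from the split."""
    n = 0
    for i in range(split - 1, -1, -1):
        if text[pos + i] != x[i]:
            break
        n += 1
    return n


def left_match_ltr(text, x, split, pos):
    """Count xl characters matched going left to right from the start."""
    n = 0
    for i in range(split):
        if text[pos + i] != x[i]:
            break
        n += 1
    return n


x = "abcdef"      # xl = "abc", xr = "def" with split = 3
text = "zbcdef"   # xr matches at pos 0, xl fails on its first character

rtl = left_match_rtl(text, x, 3, 0)  # "c" and "b" match before "a" fails
ltr = left_match_ltr(text, x, 3, 0)  # fails immediately on "a" vs "z"
```

Here the right-to-left scan reports 2 matched characters that sit directly next to xr, so "bcdef" is known to match as one contiguous run. The left-to-right scan reports 0. On a text like "abzdef" the left-to-right count would be 2, but those characters are separated from xr by the mismatch.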