I do not know about other browsers, but Google Chrome uses the Boyer-Moore search algorithm to find words on a webpage. In this algorithm, the word you have entered is compared against the page text from right to left, starting with the word's last character.
The string to be searched for is called P, the "Pattern".
The string we are searching within is called T, the "Text".
The lengths of T and P are generally represented by m and n respectively. The advantage of this algorithm is that instead of using brute force (which would try each of the m - n + 1 possible alignments of P against T, one position at a time), it preprocesses P and uses what it learns to skip as many alignments as possible.
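For comparison, here is a minimal sketch of the brute-force baseline. The function name `naive_search` and the returned counter are my own choices for illustration. It slides the pattern one position at a time, so for a text of length m and a pattern of length n it tries m - n + 1 alignments.

```python
def naive_search(text, pattern):
    """Return (match_positions, alignments_tried) for a brute-force scan."""
    m, n = len(text), len(pattern)
    matches = []
    alignments = 0
    # Try every start position where the pattern still fits inside the text.
    for start in range(m - n + 1):
        alignments += 1
        if text[start:start + n] == pattern:
            matches.append(start)
    return matches, alignments


matches, tried = naive_search("abracadabra", "abra")
print(matches, tried)
```

Every alignment is visited no matter what the text contains, which is exactly the work Boyer-Moore tries to avoid.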
According to Wikipedia:
The key insight in this algorithm is that if the end of the pattern is
compared to the text, then jumps along the text can be made rather
than checking every character of the text. The reason that this works
is that in lining up the pattern against the text, the last character
of the pattern is compared to the character in the text. If the
characters do not match, there is no need to continue searching
backwards along the text. If the character in the text does not match
any of the characters in the pattern, then the next character in the
text to check is located n characters farther along the text, where n
is the length of the pattern. If the character in the text is in the
pattern, then a partial shift of the pattern along the text is done to
line up along the matching character and the process is repeated.
Jumping along the text to make comparisons rather than checking every
character in the text decreases the number of comparisons that have to
be made, which is the key to the efficiency of the algorithm.
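The jump described in the quote can be sketched roughly as follows. Note that this is only the "look at the text character under the end of the pattern" idea, which is closer to the simplified Horspool variant than to full Boyer-Moore. The name `jump_search` and the alignment counter are assumptions made for illustration.

```python
def jump_search(text, pattern):
    """Return (first_match_index or -1, alignments_tried)."""
    m, n = len(text), len(pattern)
    if n == 0 or n > m:
        return -1, 0
    # For each pattern character (ignoring the final position), record how far
    # its rightmost occurrence is from the end of the pattern.
    shift = {ch: n - 1 - i for i, ch in enumerate(pattern[:-1])}
    start = 0
    tried = 0
    while start <= m - n:
        tried += 1
        # Compare right to left, beginning with the last pattern character.
        j = n - 1
        while j >= 0 and text[start + j] == pattern[j]:
            j -= 1
        if j < 0:
            return start, tried
        # Text character lined up with the end of the pattern: if it is not in
        # the pattern, jump a full n characters; otherwise shift just enough to
        # line it up with its rightmost occurrence in the pattern.
        start += shift.get(text[start + n - 1], n)
    return -1, tried


print(jump_search("here is a simple example", "example"))
```

On this input the sketch finds the match at index 17 after trying 5 alignments, where a position-by-position scan would have tried 18.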
The Boyer-Moore algorithm employs two approaches:
- Bad Character Heuristic
- Good Suffix Heuristic
P is preprocessed, and a separate lookup array is built for each heuristic.
The character of T that doesn't match the current character of P is called the Bad Character.
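As a rough sketch of how the bad character is used: one common formulation stores the rightmost index of every character in P, then shifts P so that this occurrence lines up with the bad character in T. The helper names `bad_char_table` and `bad_char_shift` are hypothetical, and real implementations usually use a fixed-size array instead of a dictionary.

```python
def bad_char_table(pattern):
    """Map each character to the index of its rightmost occurrence in pattern."""
    return {ch: i for i, ch in enumerate(pattern)}


def bad_char_shift(pattern, j, bad_char):
    """Shift suggested when pattern[j] mismatches the text character bad_char."""
    last = bad_char_table(pattern).get(bad_char, -1)
    # If the rightmost occurrence lies to the right of j, the rule would move
    # the pattern backwards, so fall back to a shift of 1.
    return max(1, j - last)


print(bad_char_shift("example", 6, "p"))  # 'p' occurs at index 4
print(bad_char_shift("example", 2, "i"))  # 'i' is not in the pattern
```

In the first call the pattern moves 2 positions so its 'p' sits under the text's 'p'. In the second the pattern moves 3 positions, completely past the bad character.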
A good suffix is the part at the end of P that has already been matched successfully against a substring of T (comparing right to left) before a mismatch occurs.
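To make the good suffix idea concrete, here is a deliberately slow sketch that computes the shift by direct search instead of the usual preprocessed array. It implements the weak form of the rule only (it does not additionally require the character before the reoccurring suffix to differ), and the name `good_suffix_shift` is made up for illustration.

```python
def good_suffix_shift(pattern, j):
    """Smallest shift that keeps the matched suffix pattern[j+1:] consistent.

    j is the pattern index where the mismatch occurred, so pattern[j+1:]
    is the good suffix that already matched the text.
    """
    n = len(pattern)
    for s in range(1, n + 1):
        # After shifting by s, pattern[k - s] sits where pattern[k] used to be.
        # Positions that fall off the left end impose no constraint.
        if all(k - s < 0 or pattern[k - s] == pattern[k]
               for k in range(j + 1, n)):
            return s
    return n  # only reached for an empty pattern


print(good_suffix_shift("cabab", 2))  # good suffix "ab" reoccurs 2 to the left
print(good_suffix_shift("cabab", 0))  # good suffix "abab" never reoccurs
```

The first call gives a shift of 2, and the second gives a shift of 5, the full pattern length. A real implementation would compute these values once for every j while preprocessing P.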
Both heuristics follow several rules, which you can read about in detail here and here. There is no point in copy-pasting articles from different websites.