Comparison of C++17 string search algorithms

Question

C++17 added specialized string search algorithms:

std::default_searcher
std::boyer_moore_searcher
std::boyer_moore_horspool_searcher

To quote Wikipedia on the Boyer–Moore–Horspool algorithm:

It is a simplification of the Boyer–Moore string search algorithm which is related to the Knuth–Morris–Pratt algorithm. The algorithm trades space for time in order to obtain an average-case complexity of O(n) on random text, although it has O(nm) in the worst case, where the length of the pattern is m and the length of the search string is n.

Questions:

Besides measuring, are there any guidelines on which to decide what is best?
When should I use std::boyer_moore_horspool_searcher over std::boyer_moore_searcher?
How does std::default_searcher match up against those two algorithms? Are there any standard library implementations that do not implement it with a naive string comparison (optimized for small strings)?

Conclusion:

The marked duplicate (Which is a better string searching algorithm? Boyer-Moore or Boyer Moore Horspool?) does not cover at least the question about std::default_searcher, and there seems to be some interest in an answer, so for the sake of completeness here is my attempt:

Question: How does std::default_searcher match up against those two algorithms? Are there any standard library implementations that do not implement it with a naive string comparison (optimized for small strings)?

According to the documentation on cppreference, std::default_searcher delegates to the existing pre-C++17 std::search function.
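To make the delegation concrete, here is a minimal sketch of how the searchers plug into std::search. The helper names find_offset and find_offset_classic are made up for this illustration; only std::search and the three searcher classes come from the standard library.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>

// Hypothetical helper: offset of the first match found by `searcher`,
// or haystack.size() if there is no match.
template <class Searcher>
std::size_t find_offset(const std::string& haystack, const Searcher& searcher) {
    // C++17 overload: std::search(first, last, searcher).
    auto it = std::search(haystack.begin(), haystack.end(), searcher);
    return static_cast<std::size_t>(it - haystack.begin());
}

// Hypothetical helper: the same lookup through the classic pre-C++17 overload.
std::size_t find_offset_classic(const std::string& haystack,
                                const std::string& needle) {
    auto it = std::search(haystack.begin(), haystack.end(),
                          needle.begin(), needle.end());
    return static_cast<std::size_t>(it - haystack.begin());
}
```

A call such as find_offset(text, std::default_searcher(pattern.begin(), pattern.end())) should return the same offset as find_offset_classic(text, pattern), and swapping in std::boyer_moore_searcher or std::boyer_moore_horspool_searcher changes only the algorithm, not the result. The searchers store iterators into the pattern, so the pattern has to outlive the searcher.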

I don't think the standard enforces this, but I would be surprised if any implementation used something other than a brute-force search for this function. Empirically, at least libstdc++, the standard library that ships with GCC on Linux, does in fact implement it as a brute-force search (in __search in stl_algo.h).
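For reference, this is roughly what "brute-force search" means here. It is an illustrative sketch only, not the code from stl_algo.h, and naive_find is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Illustrative brute-force search: try every start position and compare
// character by character. Returns the offset of the first match, or
// haystack.size() if there is none. Worst case is O(n*m) comparisons.
std::size_t naive_find(const std::string& haystack, const std::string& needle) {
    if (needle.empty()) return 0;  // an empty pattern matches at the start
    if (needle.size() > haystack.size()) return haystack.size();
    for (std::size_t start = 0; start + needle.size() <= haystack.size(); ++start) {
        std::size_t i = 0;
        while (i < needle.size() && haystack[start + i] == needle[i]) ++i;
        if (i == needle.size()) return start;  // every character matched
    }
    return haystack.size();
}
```

There is no preprocessing of the pattern and no extra memory, which is why this approach tends to do well on short patterns and short inputs.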

Question: Besides measuring, are there any guidelines on which to decide what is best? When should I use std::boyer_moore_horspool_searcher over std::boyer_moore_searcher?

No, there is no good rule of thumb. It depends too much on the input.

Try both and measure which is faster.
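A minimal sketch of such a measurement, assuming a single haystack and pattern that are representative of your real input. time_search_us is a hypothetical helper, and a serious benchmark would also need warm-up runs, several samples, and protection against the optimizer removing work.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <functional>
#include <string>

// Hypothetical helper: runs `repeats` searches with the given searcher and
// returns the elapsed wall-clock time in microseconds. The offset of the last
// result is written to `found_offset` so the work has an observable effect.
template <class Searcher>
long long time_search_us(const std::string& haystack, const Searcher& searcher,
                         int repeats, std::size_t& found_offset) {
    assert(repeats > 0);
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < repeats; ++i) {
        auto it = std::search(haystack.begin(), haystack.end(), searcher);
        found_offset = static_cast<std::size_t>(it - haystack.begin());
    }
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::microseconds>(stop - start).count();
}
```

Call it once per searcher with the same haystack and pattern, for example time_search_us(text, std::boyer_moore_searcher(pattern.begin(), pattern.end()), 1000, offset), and compare the timings. This sketch constructs the searcher outside the timed region, so the preprocessing cost of the Boyer-Moore variants is excluded; if your pattern changes on every search, time the construction as well.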

Given that the other question has an answer that has only pointers and asks the person to do more research, and given that an exact answer to this question would be useful to many and interesting, I vote to reopen this question. — user1952500, Jan 29 '17 at 21:23
As someone who could hammer this question open again, I don't think that the answer to the dupe target is missing anything. — David Eisenstat, Jan 29 '17 at 23:37
For the complexity case, I would say that the answers to this question do a much better job of explaining: http://stackoverflow.com/questions/18652892/difference-between-original-boyer-moore-and-boyer-moore-horspool-algorithm However, this question also asks about a few implementation details and I am interested in the answer to that as well. — user1952500, Jan 30 '17 at 04:55
