C++17 added specialized string search algorithms, among them std::boyer_moore_searcher and std::boyer_moore_horspool_searcher.
To quote Wikipedia on the Boyer–Moore–Horspool algorithm:
It is a simplification of the Boyer–Moore string search algorithm which is related to the Knuth–Morris–Pratt algorithm. The algorithm trades space for time in order to obtain an average-case complexity of O(n) on random text, although it has O(nm) in the worst case, where the length of the pattern is m and the length of the search string is n.
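To make the "trades space for time" remark concrete, here is a minimal sketch of the Horspool idea, assuming 8-bit characters. The function name horspool_find is hypothetical and this is not how any particular standard library implements std::boyer_moore_horspool_searcher; it only illustrates the precomputed shift table.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical helper (not a standard library function): returns the index of
// the first occurrence of `pattern` in `text`, or std::string_view::npos.
std::size_t horspool_find(std::string_view text, std::string_view pattern) {
    const std::size_t n = text.size();
    const std::size_t m = pattern.size();
    if (m == 0) return 0;
    if (m > n) return std::string_view::npos;

    // The extra space: one shift distance per possible byte value.
    // Bytes that do not occur in the pattern allow a full shift of m.
    std::array<std::size_t, 256> shift;
    shift.fill(m);
    // The last pattern character is deliberately excluded, so every shift is >= 1.
    for (std::size_t i = 0; i + 1 < m; ++i) {
        shift[static_cast<unsigned char>(pattern[i])] = m - 1 - i;
    }

    std::size_t pos = 0;
    while (pos + m <= n) {
        if (text.compare(pos, m, pattern) == 0) return pos;
        // Shift by the table entry of the text character aligned with the
        // last pattern position.
        pos += shift[static_cast<unsigned char>(text[pos + m - 1])];
    }
    return std::string_view::npos;
}
```

On random text most alignments are skipped in large steps, which is where the average-case O(n) comes from. A pattern such as "baaa" searched in a long run of 'a' characters compares almost the whole pattern and then shifts by only one each time, which gives the O(nm) worst case.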
Questions:
- Besides measuring, are there any guidelines on which to decide what is best?
- When should I use std::boyer_moore_horspool_searcher over std::boyer_moore_searcher?
- How does std::default_searcher match up against those two algorithms? Are there any standard library implementations that do not implement it with a naive string comparison (optimized for small strings)?
Conclusion:
At least the question about std::default_searcher is not covered by the marked duplicate (Which is a better string searching algorithm? Boyer-Moore or Boyer Moore Horspool?), and there seems to be some interest in the answer, so for the sake of completeness here is my attempt:
Question: How does std::default_searcher match up against those two algorithms? Are there any standard library implementations that do not implement it with a naive string comparison (optimized for small strings)?
According to the documentation on cppreference, std::default_searcher delegates to the existing pre-C++17 std::search function.
I don't think the standard enforces this, but I would be surprised if this function were not always implemented as a brute-force search. Empirically, at least libstdc++, the standard library typically shipped on Linux, does implement it as a brute-force search (see __search in stl_algo.h).
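The delegation can be checked with a small sketch like the following, assuming a C++17 compiler. The helper names find_plain and find_default are made up for illustration; both return the match offset, or haystack.size() if there is no match.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>

// Hypothetical helper: the classic pre-C++17 four-iterator std::search overload.
std::size_t find_plain(const std::string& haystack, const std::string& needle) {
    auto it = std::search(haystack.begin(), haystack.end(),
                          needle.begin(), needle.end());
    return static_cast<std::size_t>(it - haystack.begin());
}

// Hypothetical helper: the C++17 searcher overload with std::default_searcher.
std::size_t find_default(const std::string& haystack, const std::string& needle) {
    auto it = std::search(haystack.begin(), haystack.end(),
                          std::default_searcher(needle.begin(), needle.end()));
    return static_cast<std::size_t>(it - haystack.begin());
}
```

Both functions should return the same offset for any input. Whether they also perform identically depends on the implementation, so treat that part as something to measure rather than assume.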
Question: Besides measuring, are there any guidelines on which to decide what is best? When should I use std::boyer_moore_horspool_searcher over std::boyer_moore_searcher?
No, there is no good rule of thumb. It depends too much on the input.
Try both and measure which is faster.
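A minimal sketch of such a measurement, assuming C++17 and wall-clock timing with std::chrono::steady_clock. The names Timing and time_search are hypothetical; a serious benchmark would also need warm-up runs, representative inputs, and protection against the optimizer removing the loop.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <functional>
#include <string>

// Hypothetical result type: where the pattern was found and how long it took.
struct Timing {
    std::size_t offset;                // haystack.size() if not found
    std::chrono::nanoseconds elapsed;  // total time for all repeats
};

// Hypothetical helper: runs std::search with an already constructed searcher
// `repeats` times and reports the last match offset and the elapsed time.
template <class Searcher>
Timing time_search(const std::string& haystack, const Searcher& searcher, int repeats) {
    std::size_t offset = haystack.size();
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < repeats; ++i) {
        auto it = std::search(haystack.begin(), haystack.end(), searcher);
        offset = static_cast<std::size_t>(it - haystack.begin());
    }
    const auto stop = std::chrono::steady_clock::now();
    return {offset, std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start)};
}
```

Calling it once each with std::default_searcher(pat.begin(), pat.end()), std::boyer_moore_searcher(pat.begin(), pat.end()) and std::boyer_moore_horspool_searcher(pat.begin(), pat.end()) on your own data gives a first comparison. Note that this sketch times only the search itself: the two Boyer-Moore variants build their tables in the constructor, so if you search for each pattern only once, the construction cost should be included in the measurement as well.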