I expected the Boost regex engines to be faster than boost::algorithm.
This simple test shows boost::algorithm beating both regex engines by a wide margin.
This is the entire test program.
Did I miss something?

#include "boost/algorithm/string.hpp"
#include "boost/regex.hpp"
#include "boost/xpressive/xpressive.hpp"
#include "boost/progress.hpp"
#include <iostream>
#include <string>

int main()
{
    boost::timer tm;
    const int ITERATIONS = 10000000;
    {
        std::string input("This is his face");
        tm.restart();
        for( int i = 0; i < ITERATIONS; ++i)
        {
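            // replace_all modifies input in place, so after the first
            // iteration there is no "his" left to replace.
            boost::algorithm::replace_all(input,"his","her");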
        }
        std::cout << "boost::algorithm: " << tm.elapsed()/60 << std::endl;
    }

    {
        std::string input("This is his face");
        boost::regex expr("his");
        std::string format("her");
        tm.restart();
        for( int i = 0; i < ITERATIONS; ++i)
        {
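            // regex_replace returns a new string and leaves input unchanged,
            // so every iteration performs the full replacement plus a copy.
            boost::regex_replace( input, expr, format );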
        }
        std::cout << "boost::regex: " << tm.elapsed()/60 << std::endl;
    }

    {
        std::string input("This is his face");
        boost::xpressive::sregex expr = boost::xpressive::as_xpr("his");
        std::string format("her");
        tm.restart();
        for( int i = 0; i < ITERATIONS; ++i)
        {
            // As with boost::regex, the result is returned as a new string.
            boost::xpressive::regex_replace(input, expr, format);
        }
        std::cout << "boost::xpressive: " << tm.elapsed()/60 << std::endl;
    }

    return 0;
}
user754425

2 Answers

A regex engine can handle all kinds of regular expressions (for example, something like "My.*Test" can be matched in a text like "I wonder how many classes called MySumTest have been written?"). That makes regexes more powerful, but slower than plain string algorithms when all you need is to find a fixed pattern in a text.

b.buchhold

I don't find this all that surprising; simple things usually are faster. In higher-level languages, say JavaScript, it's usually a win to delegate string processing to a regular expression, because even a simple loop carries a lot of overhead in an interpreted language. The same reasoning doesn't apply to compiled languages like C++.

Anyway, I would say you should use the Boost string algorithms over regex where it is reasonable to do so, because boost::regex introduces a runtime dependency (it uses an external .so file), while the algorithms are basically inline code generators. Use regexes only where you need them, say when looking for a floating-point number:

[-+]?[0-9]*\.?[0-9]+([eE][-+]?[0-9]+)?

Would you want to try that without regular expressions?

olooney