I am getting the above error using the pattern

"\b([0-9a-zA-Z]([-.\w]*[0-9a-zA-Z])*@([0-9a-zA-Z][-\w]*[0-9a-zA-Z]\.)+[a-zA-Z]{2,9})\b"

(For email checking)

whenever the search content's length is longer than 17 characters.

I have tried using boost 1.42 and 1.61 and got the same result.

The platform is AIX 7 with g++ 4.8.5

The c++ test program is very simple like this:

boost::regex e({the above search pattern});
bool match = boost::regex_search({my search content}, e);

Please note that the same program does not throw the exception on Windows (compiled with MSVC).

From other SO questions, I know it is possible to get this exception. But I don't know if this is platform specific or not.

I also tried an online regex tester, and there is no problem with the same search pattern (i.e. if the search content contains an email address, the online tool finds the match).

My question is why the same program with the same search pattern does not fail on Windows.

  • You may use a simpler regex, like `\S+@\S+\.\S+` – Wiktor Stribiżew Oct 27 '17 at 07:39
  • I would like to know whether the platform difference DOES exist, or whether I am doing something wrong (e.g. not building the regex lib correctly on AIX) – Kelvin Cheng Oct 27 '17 at 08:36
  • This message just tells you that the regular expression is not efficient and you must re-write it to match in a linear way. With `abc12334@gmmmmmaaallllll.com1`, [it causes a catastrophic backtracking issue](https://regex101.com/r/74FYug/1). Change the first part to `[0-9a-zA-Z](?:[-.\w]*[0-9a-zA-Z])?` and [it will fail gracefully](https://regex101.com/r/74FYug/2). – Wiktor Stribiżew Oct 27 '17 at 08:40
  • Regarding the part where you've mentioned that it works on windows, please note that boost.regex defines `max_state_count` constant that's indeed platform dependent and is checked to trigger your error. See estimate_max_state_count in `perl_matcher_common.hpp` and notice `std::numeric_limits::max`. – yuyoyuppe Oct 27 '17 at 08:52
  • yuyoyuppe, the std::numeric_limits::max is 268598104, while the BOOST_MAX_STATE_COUNT is 100000000. So what does this mean? – Kelvin Cheng Oct 27 '17 at 09:02
  • @Wiktor: after testing the same program on Linux, AIX and Windows, I can confirm that the exception is not thrown on Windows only (with the u32 version of regex, using ICU). Both Linux and AIX have the same issue. Unfortunately, the regular expression was not written by me; my job is to port the program to AIX. I agree that the expression has its own problem. However, the current system allows users to input regular expressions but lacks a feedback UI to alert users to potential issues with the expression. But that's another story. Thanks Wiktor, and thanks for pointing me to the regex101 website. – Kelvin Cheng Oct 31 '17 at 04:35
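
The rewrite suggested in the comments (making the first group optional with `(?:...)?` instead of repeating it with `(...)*`) can be sketched as below. This is a minimal illustration, not the original program: it uses `std::regex` rather than `boost::regex` only so that it builds without Boost, on the assumption that both engines treat this pattern the same way for these inputs. The names `kEmailPattern` and `find_email` are hypothetical.

```cpp
#include <regex>
#include <string>

// Pattern from the question, with the first group changed from a repeated
// group (...)* to an optional non-capturing group (?:...)?, as suggested
// in the comments. The rest of the pattern is unchanged.
static const char* const kEmailPattern =
    R"(\b([0-9a-zA-Z](?:[-.\w]*[0-9a-zA-Z])?@([0-9a-zA-Z][-\w]*[0-9a-zA-Z]\.)+[a-zA-Z]{2,9})\b)";

// Hypothetical helper: returns the first email-like substring of `text`,
// or an empty string when nothing matches.
std::string find_email(const std::string& text) {
    static const std::regex re(kEmailPattern);
    std::smatch m;
    if (std::regex_search(text, m, re)) {
        return m[1].str();  // group 1 is the address without the \b anchors
    }
    return std::string();
}
```

With this pattern, `find_email("contact abc12334@example.com now")` should return the address, while the input from the comments, `abc12334@gmmmmmaaallllll.com1`, should simply fail to match instead of backtracking heavily.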
