
ThreadSanitizer reports a data race in boost::regex_match. Is this a false positive? I cannot find any synchronization primitives that depend on BOOST_HAS_THREADS in the call stacks. All input parameters live on the stack of the respective thread and are not shared.

==================
WARNING: ThreadSanitizer: data race (pid=1893)
  Write of size 4 at 0x007e19fa8ff0 by thread T36:
    #0 boost::re_detail_106700::saved_state::saved_state(unsigned int) include/boost/regex/v4/perl_matcher_non_recursive.hpp:59 
    #1 boost::re_detail_106700::perl_matcher<char const*, std::allocator<boost::sub_match<char const*> >, boost::regex_traits<char, boost::cpp_regex_traits<char> > >::push_recursion_stopper() include/boost/regex/v4/perl_matcher_non_recursive.hpp:288
    #2 boost::re_detail_106700::perl_matcher<char const*, std::allocator<boost::sub_match<char const*> >, boost::regex_traits<char, boost::cpp_regex_traits<char> > >::match_all_states() include/boost/regex/v4/perl_matcher_non_recursive.hpp:202
    #3 boost::re_detail_106700::perl_matcher<char const*, std::allocator<boost::sub_match<char const*> >, boost::regex_traits<char, boost::cpp_regex_traits<char> > >::match_prefix() include/boost/regex/v4/perl_matcher_common.hpp:336
    #4 boost::re_detail_106700::perl_matcher<char const*, std::allocator<boost::sub_match<char const*> >, boost::regex_traits<char, boost::cpp_regex_traits<char> > >::match_imp() include/boost/regex/v4/perl_matcher_common.hpp:220
    #5 boost::re_detail_106700::perl_matcher<char const*, std::allocator<boost::sub_match<char const*> >, boost::regex_traits<char, boost::cpp_regex_traits<char> > >::match() include/boost/regex/v4/perl_matcher_common.hpp:193
    #6 bool boost::regex_match<char const*, std::allocator<boost::sub_match<char const*> >, char, boost::regex_traits<char, boost::cpp_regex_traits<char> > >(char const*, char const*, boost::match_results<char const*, std::allocator<boost::sub_match<char const*> > >&, boost::basic_regex<char, boost::regex_traits<char, boost::cpp_regex_traits<char> > > const&, boost::regex_constants::_match_flags) include/boost/regex/v4/regex_match.hpp:50
    #7 bool boost::regex_match<char, std::allocator<boost::sub_match<char const*> >, boost::regex_traits<char, boost::cpp_regex_traits<char> > >(char const*, boost::match_results<char const*, std::allocator<boost::sub_match<char const*> > >&, boost::basic_regex<char, boost::regex_traits<char, boost::cpp_regex_traits<char> > > const&, boost::regex_constants::_match_flags) /var/lib/jenkins/workspace/nightly-jnd-navigation__tsd-nav-rsi-viwi-dev/system/ext-boost-dev/dist/17-89ad-bc06/usr/include/boost/regex/v4/regex_match.hpp:73 (tsd.nav.mainapp.mib3+0x3dd0610)
<...>

  Previous write of size 4 at 0x007e19fa8ff0 by thread T105:
    [failed to restore the stack]

  Location is heap block of size 4096 at 0x007e19fa8000 allocated by thread T105:
    #0 operator new(unsigned long) <null> (libtsan.so.0+0x79f54)
    #1 boost::re_detail_106700::save_state_init::save_state_init(boost::re_detail_106700::saved_state**, boost::re_detail_106700::saved_state**) include/boost/regex/v4/perl_matcher_non_recursive.hpp:107
    #2 boost::re_detail_106700::perl_matcher<char const*, std::allocator<boost::sub_match<char const*> >, boost::regex_traits<char, boost::cpp_regex_traits<char> > >::match_imp() include/boost/regex/v4/perl_matcher_common.hpp:202
    #3 boost::re_detail_106700::perl_matcher<char const*, std::allocator<boost::sub_match<char const*> >, boost::regex_traits<char, boost::cpp_regex_traits<char> > >::match() include/boost/regex/v4/perl_matcher_common.hpp:193
    #4 bool boost::regex_match<char const*, std::allocator<boost::sub_match<char const*> >, char, boost::regex_traits<char, boost::cpp_regex_traits<char> > >(char const*, char const*, boost::match_results<char const*, std::allocator<boost::sub_match<char const*> > >&, boost::basic_regex<char, boost::regex_traits<char, boost::cpp_regex_traits<char> > > const&, boost::regex_constants::_match_flags) include/boost/regex/v4/regex_match.hpp:50
    #5 bool boost::regex_match<char, std::allocator<boost::sub_match<char const*> >, boost::regex_traits<char, boost::cpp_regex_traits<char> > >(char const*, boost::match_results<char const*, std::allocator<boost::sub_match<char const*> > >&, boost::basic_regex<char, boost::regex_traits<char, boost::cpp_regex_traits<char> > > const&, boost::regex_constants::_match_flags) include/boost/regex/v4/regex_match.hpp:73

<...>

Regards

sehe
Desperado17
  • It's always good to include a reproducing example. In this case we could verify that you are doing the things that are required (per my answer). We could even establish whether there is a library bug. – sehe Dec 09 '22 at 11:50

1 Answer

I think the documentation is pretty definitive:

Thread Safety

The Boost.Regex library is thread safe when Boost is: you can verify that Boost is in thread safe mode by checking to see if BOOST_HAS_THREADS is defined: this macro is set automatically by the config system when threading support is turned on in your compiler.

Class basic_regex and its typedefs regex and wregex are thread safe, in that compiled regular expressions can safely be shared between threads. The matching algorithms regex_match, regex_search, and regex_replace are all re-entrant and thread safe. Class match_results is now thread safe, in that the results of a match can be safely copied from one thread to another (for example one thread may find matches and push match_results instances onto a queue, while another thread pops them off the other end), otherwise use a separate instance of match_results per thread.

The POSIX API functions are all re-entrant and thread safe, regular expressions compiled with regcomp can also be shared between threads.

The class RegEx is only thread safe if each thread gets its own RegEx instance (apartment threading) - this is a consequence of RegEx handling both compiling and matching regular expressions.

Finally note that changing the global locale invalidates all compiled regular expressions, therefore calling set_locale from one thread while another uses regular expressions will produce unpredictable results.

There is also a requirement that there is only one thread executing prior to the start of main().

So, you need to make sure:

  • you are not sharing the match_results object between threads (your description doesn't say, perhaps because you don't count it as an input parameter)

  • the regex is pre-compiled:

    [are] thread safe, in that compiled regular expressions can safely be shared between threads

sehe
  • Is the second paragraph always true, even if BOOST_HAS_THREADS is not set? – Desperado17 Dec 09 '22 at 19:16
  • No, but I think BOOST_HAS_THREADS is always set, for all relevant compilers. The only way to not get it is by explicitly defining BOOST_DISABLE_THREADS. It's also easy to add a static_assert, or even a preprocessor #error, if it isn't set. – sehe Dec 09 '22 at 20:57