I have been working with the <regex> library (Microsoft Visual Studio 2012, Update 3) to implement a slightly safer loading procedure for my application, and have run into a few teething difficulties (cf. Regular Expression causing Stack Overflow; Concurrently using std::regex, defined behaviour?; and ECMAScript Regex for a multilined string).
I got around my initial troubles (a stack overflow, etc.) by using the regex suggested here, and it has been working well. However, if my file is too big, it still causes a stack overflow with the default stack size; and if I increase the stack commit and reserve sizes enough to avoid the overflow, the search instead fails with a std::regex_error with error code 12 (error_stack).
Here is a self-contained example to replicate the issue:
#include <iostream>
#include <string>
#include <regex>
std::string szTest = "=== TEST1 ===\n<Example1>:Test Data\n<Example2>:More Test Data\n<Example3>:Test\nMultiline\nData\n<Example4>:test_email@test.com\n<Example5>:0123456789\n=== END TEST1 ===\n=== TEST2 ===\n<Example1>:Test Data 2\n<Example2>:More Test Data 2\n<Example3>:Test\nMultiline\nData\n2\n<Example4>:test_email2@test.com\n=== END TEST2 ===\n=== TEST3 ===\n<Example1>:Random Test Data\n<Example 2>:More Random Test Data\n<Example 3>:Some\nMultiline\nRandom\nStuff\n=== END TEST3 ===\n\
=== TEST1 ===\n<Example1>:Test Data (Second)\n<Example2>:Even More Test Data\n<Example3>:0123456431\n=== END TEST1 ===";
int main()
{
    static const std::regex regexObject( "=== ([^=]+) ===\\n((?:.|\\n)*)\\n=== END \\1 ===", std::regex_constants::ECMAScript | std::regex_constants::optimize );
    for( std::sregex_iterator itObject( szTest.cbegin(), szTest.cend(), regexObject ), end; itObject != end; ++itObject )
    {
        std::cout << "Type: " << (*itObject)[1].str() << std::endl;
        std::cout << "Data: " << (*itObject)[2].str() << std::endl;
        std::cout << "-------------------------------------" << std::endl;
    }
}
Building this with the default stack size (4kB commit and 1MB reserve) and running it results in a stack overflow exception; after changing the stack size (8kB commit and 2MB reserve), it instead throws a std::regex_error with error code 12 (error_stack).
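For reference, the std::regex_error case can at least be caught and identified at runtime. The following is only a minimal sketch: describe_regex_error and safe_search are hypothetical helper names of my own, not part of <regex>, and the only library facilities assumed are std::regex_error::code() and the std::regex_constants error values. It does not help with the genuine stack overflow, which is not a C++ exception.

```cpp
#include <regex>
#include <string>

// Hypothetical helper (not part of <regex>): maps the code carried by a
// std::regex_error to a short label so the failure mode can be logged.
std::string describe_regex_error( const std::regex_error& e )
{
    if( e.code() == std::regex_constants::error_stack )
        return "error_stack";       // engine ran out of memory while matching
    if( e.code() == std::regex_constants::error_complexity )
        return "error_complexity";  // match exceeded the engine's complexity limit
    return "other";
}

// Hypothetical wrapper: runs std::regex_search and reports a thrown
// std::regex_error through 'failure' instead of letting it propagate.
// 'text' must outlive 'match', because the match stores iterators into it.
bool safe_search( const std::string& text, const std::regex& re,
                  std::smatch& match, std::string& failure )
{
    failure.clear();
    try
    {
        return std::regex_search( text, match, re );
    }
    catch( const std::regex_error& e )
    {
        failure = describe_regex_error( e );
        return false;
    }
}
```

With this, a false return with a non-empty failure string distinguishes "the engine gave up" from "no match found", which is how I would confirm that the second case above really is error_stack.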
Is there anything I can do to prevent these errors, or was the regex library simply designed to be used only with small strings (e.g. DoB checking)?
Thanks in advance!