Sequencing and repetition use up stack because they require backtracking. Let's look at your examples.
range('0','6')
This is implemented as a quick check to see if a character falls within a character range. It doesn't use sequencing or repetition, so it doesn't gobble stack space.
('0'|'1'|'2'|'3'|'4'|'5'|'6')
In general, don't put in an alternate what can be put in a set or a range. They are far more efficient for this kind of thing. I'll note that obviously this isn't a valid xpressive regex. You would write it as:
(as_xpr('0')|'1'|'2'|'3'|'4'|'5'|'6')
I'm sure you know that. I'll make a similar edit for this one:
set[as_xpr('0')|'1'|'2'|'3'|'4'|'5'|'6']
Sets are fast. They are implemented as a look-up table for the ascii range and (for Unicode) a sparse vector for character codes above 256. Because of the sparseness of some sets, this scheme can take up more heap space than a simple range. But that doesn't effect the amount of stack space used when matching one. I'll note you can also write it as:
(set= '1','2','3','4','5','6')
Now for repetition:
repeat<3>(_d) >> _b
As I said before, repetition is usually implemented with recursion to get the backtracking right. But in this case, xpressive recognizes that _d
can only ever eat one character, and that it's being repeated exactly 3 times. So in this limited case, it's implemented with a loop to conserve stack space. This is in contrast to the next case:
_d >> _d >> _d >> _b
Xpressive isn't smart enough to treat _d >> _d >> _d
the same way as repeat<3>(_d)
. As a result, this will consume a separate stack frame for each _d
.
Short version: prefer specialized constructs where you can over their more general equivalents. That gives xpressive more context for optimization.
In addition, get to know independent sub-expressions. If you can put something in a keep()
, go for it. Why? keep()
matches a sub-expression, then reclaims the stack space used by that sub-match. That makes it impossible to try different things by backtracking into that sub-expression, but sometimes you don't need that.
Hope that helps.