I'd like to build several sub-expressions into a larger regular expression, where each sub-expression can match at one position in the input or at another, but not at both, preferably using the same named group for each "area of interest". For example, I'd like to match the volume units (in italics below) and the currency units (in bold).
- $3.23 **USD** / *gal.*
- **USD** 3.23 in *gallons*
- 4.50 **CAD** / *gal*
- 1 *gal* @ **USD** 3.23
- 10 *gal.* @ $4.50 **CAD**
Or more generally:
- stuffmorestuffXXXyetmorestuff
- stuffXXXmorestuff
where stuff and morestuff could be a complex set of sub-expressions.
It seems like it might be possible using some combination of
- group stack push/pop
- balancing groups
- look-around
but I'm not sure how to proceed. Does it come down to alternations (`|`) or multiple passes with different expressions (which I suppose amounts to the same thing)?
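To make the look-around idea concrete, here is a minimal sketch of what I have in mind, written in Python only because it is easy to run. It assumes the currency is one of the codes `USD` or `CAD` and the volume is some spelling of "gallon"; the names `UNIT_PATTERN` and `find_units` are made up for illustration. Each lookahead scans from the start of the line, so the two named groups are captured regardless of which comes first in the input.

```python
import re

# Hypothetical pattern: both lookaheads are anchored at the start of the line,
# so neither capture depends on where the other one appears.
UNIT_PATTERN = re.compile(
    r"""
    ^
    (?=.*?\b(?P<currency>USD|CAD)\b)      # a currency code anywhere in the line
    (?=.*?\b(?P<volume>gal(?:lons?)?)\b)  # a volume unit anywhere in the line
    """,
    re.VERBOSE,
)


def find_units(line):
    """Return (currency, volume) for a line, or None if either is missing."""
    match = UNIT_PATTERN.search(line)
    if match is None:
        return None
    return match.group("currency"), match.group("volume")


samples = [
    "$3.23 USD / gal.",
    "USD 3.23 in gallons",
    "4.50 CAD / gal",
    "1 gal @ USD 3.23",
    "10 gal. @ $4.50 CAD",
]

for sample in samples:
    print(sample, "->", find_units(sample))
```

This keeps one named group per area of interest, but it only reports the first occurrence of each unit and does not check the structure between them. The alternation route would look different per engine: .NET lets the same group name appear in several branches, whereas Python's built-in `re` module rejects a repeated group name, so there the branches would need distinct names that get merged afterwards.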