For the string "1-2-3-4", I wanted to create a regex that would give me the following matches, i.e. each pair of adjacent digits separated by one or more dashes:
"1-2"
"2-3"
"3-4"
with each digit in its own capture group.
First attempt (C# flavour):
(?<first>\d)-+(?<second>\d)
This gives me:
"1-2"
"3-4"
Obviously, by the time the first match is found, the digit "2" has already been consumed, so the next match attempt starts at the dash after "2".
Then I ended up brushing up on my C#/regex skills and stumbled upon balancing groups; enter my stubbornness. As I understood it, this should do it (but it doesn't):
(?<entire>(?:(?<first-entire>\k<entire>)|(?<first>\d))-+(?<second>\d))
This yields the same result as my first attempt. I expected the <first-entire> construct to rewind the captures to the previous match (if any), so that the \k<entire> part would match the entire previous match (which after one iteration would be "1-2"), or, on the first iteration, fall back to the <first>\d pattern.
What have I misunderstood?
Update: I probably should have explained exactly what I was aiming to do. Hinted by a commenter, I found the solution to my actual aim, which was to remove all dashes (one or more) that occur between digits. Solved with a simpler positive look-ahead:
Regex _stripTheDashes = new Regex(@"(?<digit>\d)-+(?=\d)", RegexOptions.Compiled);
var stripped = _stripTheDashes.Replace(s, m => m.Groups["digit"].Value);
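For anyone wanting to try this, here is a minimal self-contained sketch of the same replacement. The class name DashStripper, the method name Strip and the sample inputs are made up for illustration; only the pattern and the Replace call come from the snippet above.

```csharp
using System;
using System.Text.RegularExpressions;

public static class DashStripper
{
    // Same pattern as above: a digit, then one or more dashes,
    // then a digit that is checked by the look-ahead but not consumed.
    private static readonly Regex StripTheDashes =
        new Regex(@"(?<digit>\d)-+(?=\d)", RegexOptions.Compiled);

    // Hypothetical helper: replaces each "digit + dashes" match with just the digit.
    public static string Strip(string s) =>
        StripTheDashes.Replace(s, m => m.Groups["digit"].Value);

    public static void Main()
    {
        Console.WriteLine(Strip("1-2--3-4")); // 1234
        Console.WriteLine(Strip("a-1-2-b"));  // a-12-b (only dashes between digits go)
    }
}
```

Because the second digit is never consumed, it is still available as the starting digit of the next match, which is exactly what my first attempt was missing.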
I will leave this as-is, since it's been closed as a duplicate. And apparently I was wrong about using balancing groups :)
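For completeness, the overlapping pairs I originally asked for also seem to be reachable with a look-ahead rather than balancing groups. The sketch below is an assumption on my part rather than a confirmed answer: it consumes only the first digit and puts the dashes and the second digit inside the look-ahead, relying on .NET keeping captures made inside a positive look-ahead. The names OverlappingPairs and Find are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class OverlappingPairs
{
    // Only <first> is consumed. The dashes and <second> are inside a look-ahead,
    // so the next match attempt can start at the digit captured as <second>.
    private static readonly Regex Pair =
        new Regex(@"(?<first>\d)(?=-+(?<second>\d))");

    // Hypothetical helper: returns each pair formatted as "first-second".
    public static List<string> Find(string input)
    {
        var pairs = new List<string>();
        foreach (Match m in Pair.Matches(input))
        {
            // m.Value is only the first digit, so the pair is rebuilt from the groups.
            pairs.Add(m.Groups["first"].Value + "-" + m.Groups["second"].Value);
        }
        return pairs;
    }

    public static void Main()
    {
        // Prints 1-2, 2-3 and 3-4, one per line.
        foreach (var pair in Find("1-2-3-4"))
        {
            Console.WriteLine(pair);
        }
    }
}
```

Note that the match value itself is just the first digit here, so this gives the two capture groups per pair but not the literal text "1-2" as the match.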