
I would like to get an array of all captured group matches in chronological order (the order in which they appear in the input string).

For example, with the following regex:

(?P<fooGroup>foo)|(?P<barGroup>bar)

and the following input:

foo bar foo

I would like to get something that resembles the following output:

[("fooGroup", (0,3)), ("barGroup", (4,7)), ("fooGroup", (8,11))]

Is this possible to do without manually sorting all matches?

Joakim Danielson
Jomy
  • Please don't ask for an answer in any of several languages and don't tag the question with multiple languages either. I assume the answer you have gotten is written in Rust so I left that tag – Joakim Danielson Mar 25 '22 at 17:38
  • @JoakimDanielson Sorry about that, I'll think about it next time – Jomy Mar 25 '22 at 19:14

1 Answer


I don't know what you mean by "without manually sorting all matches," but this Rust code produces the output you want for this particular style of pattern:

use regex::Regex;

fn main() {
    let pattern = r"(?P<fooGroup>foo)|(?P<barGroup>bar)";
    let haystack = "foo bar foo";
    let mut matches: Vec<(String, (usize, usize))> = vec![];

    let re = Regex::new(pattern).unwrap();
    // We skip the first capture group, which always corresponds
    // to the entire pattern and is unnamed. Otherwise, we assume
    // every capturing group has a name and corresponds to a single
    // alternation in the regex.
    let group_names: Vec<&str> =
        re.capture_names().skip(1).map(|x| x.unwrap()).collect();
    for caps in re.captures_iter(haystack) {
        for name in &group_names {
            if let Some(m) = caps.name(name) {
                matches.push((name.to_string(), (m.start(), m.end())));
            }
        }
    }

    println!("{:?}", matches);
}

The only real trick here is to make sure group_names is correct. It's correct for any pattern of the form (?P<name1>re1)|(?P<name2>re2)|...|(?P<nameN>reN) where each reI contains no other capturing groups.

BurntSushi5