How to get Captured Part of the Regular Expression in Nim

Question

I want to extract "some_token" from the "some text :some_token" text.

The code below returns the full match ' :some_token', not the captured part 'some_token' matched by the group ([a-z0-9_-]+).

import re

let expr = re("\\s:([a-z0-9_-]+)$", flags = {re_study, re_ignore_case})
for match in "some text :some_token".find_bounds(expr):
  echo "'" & match & "'"

How could it be modified to return only the captured part?

P.S.

Also, what's the difference between re and nre modules?

pietroppeter · Accepted Answer · 2020-09-24T09:15:54.053

The submitted code does not compile: find_bounds returns a tuple[first, last: int], which cannot be iterated with for. Still, it is true that find_bounds in that example gives the index bounds of the whole pattern match and not of the captured substring.

The following (https://play.nim-lang.org/#ix=2yvs) works to give the captured string:

import re

let expr = re("\\s:([a-z0-9_-]+)$", flags = {re_study, re_ignore_case})
var matches: array[1, string]
if "some text :some_token".find(expr, matches) >= 0:
  echo matches  # -> ["some_token"]

Note that in the code above, matches must have the correct length for the captured groups (a sequence will not work unless it is created with the correct length). This is a known issue of re: https://github.com/nim-lang/Nim/issues/9472
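As a minimal sketch of the point about length, the same pattern can be used with a sequence that is pre-sized to one slot per capture group, here passed to the findBounds overload that also fills matches. This assumes the stdlib re module of Nim 1.x or later; the variable names are illustrative only.

```nim
import re

let expr = re("\\s:([a-z0-9_-]+)$", flags = {reStudy, reIgnoreCase})

# Pre-size the sequence: one slot for the single capture group.
var matches = newSeq[string](1)

# This overload returns the bounds of the whole match and fills `matches`
# with the captured groups. On failure `first` is -1.
let (first, last) = "some text :some_token".findBounds(expr, matches)

if first >= 0:
  echo "whole match spans ", first, "..", last
  echo "captured: ", matches[0]
```

The bounds still describe the whole match (including the leading space and colon); only matches holds the captured part.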

Regarding the dual existence of re and nre, summarizing from this discussion:

nre has a different, more ergonomic API than re, which stays closer to the C API
nre had fewer issues than re in the past, but the gap has closed in recent times (see also the open regex issues)
nre might be moved out of the stdlib and into a nimble package in the future, but since this did not happen in v1, it probably will not happen before v2
note that there is also a pure Nim regex implementation (nim-regex), which has an ergonomic API as well.
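To illustrate the API difference, here is a rough sketch of the same extraction with nre. It assumes Nim 1.x or later, where find returns an Option[RegexMatch] and captures[0] yields a string. The helper name capturedToken is hypothetical, and the inline (?i) flag stands in for re_ignore_case, since nre takes its options inside the pattern.

```nim
import std/[nre, options]

# Hypothetical helper: returns the first capture group if the pattern matches.
proc capturedToken(text: string): Option[string] =
  # (?i) makes the pattern case-insensitive.
  let m = text.find(re"(?i)\s:([a-z0-9_-]+)$")
  if m.isSome:
    # captures[0] is the first parenthesised group, not the whole match.
    result = some(m.get.captures[0])
  else:
    result = none(string)

let token = capturedToken("some text :some_token")
if token.isSome:
  echo token.get
```

With nre there is no pre-sized matches buffer to manage; the captures are read from the returned match object.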
