Matching character followed by exactly 1 digit

Question

I need to align the formatting of some clinical trial IDs to merge two databases. For example, in database A patient 123 visit 1 is stored as '123v01' and in database B as just '123v1'.

I can match A to B by using grep to find the IDs containing 'v0' and stripping out the zero to leave just 'v'. For academic interest, and to expand my R / regex skills, I want to do the reverse and match B to A: match only the IDs containing 'v' followed by exactly one digit, so I can then pad that digit with a leading zero.

For a reprex:

string <- c("123v1", "123v01", "123v001")

I can match those with >= 2 digits following a 'v', then inverse subset

> idx <- grepl("v(\\d{2})", string)
> string[!idx]
[1] "123v1"

But there must be a way to match 'v' followed by exactly one digit? I have tried lookarounds:

# Negative look ahead "v not followed by 2+ digits"
grepl("v(?!\\d{2})", string)

# Positive look behind "single digit following v"
grepl("(?<=v)\\d{1})", string)

But both return an 'invalid regex' error

Any suggestions?

I'm not much good at regex, but I suggest `[vV][0-9]{1}[!0-9]` — Agi Hammerthief, Aug 19 '19 at 15:08
Or even shorter, as `\\d` is one digit: `grepl("v\\d$", string)`, where `$` indicates end of string. But maybe it's better to remove all leading zeros, e.g. with `sub("v0*", "v", string)`, and then make the match. — GKi, Aug 19 '19 at 16:19
Mind that `v(?!\d{2})` matches `vWORD_HERE` - i.e. even when no digit is there after `v`. See [my answer](https://stackoverflow.com/a/57559945/3832970) with the proper solution. — Wiktor Stribiżew, Aug 19 '19 at 19:35

score 3 · Answer 1 · answered Aug 19 '19 at 15:25


You need to set the perl=TRUE flag on your grepl function.

e.g.

grepl("v(?!\\d{2})", string, perl=TRUE)
[1]  TRUE FALSE FALSE

See this question for more info.

meenaparam


`v(?!\d{2})` matches `vWORD_HERE` - i.e. even when no digit is there after `v`. See [your regex demo](https://regex101.com/r/G8F2iN/1) with an unexpected match. – Wiktor Stribiżew Aug 21 '19 at 07:14
Ah thanks, I didn't check whether the OP's regex's solved their problem, I just addressed the "invalid regex" error the OP mentioned. – meenaparam Aug 21 '19 at 10:50
Yes, but it does not solve the issue. Just using `perl=TRUE` does not help. – Wiktor Stribiżew Aug 21 '19 at 11:02
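
To illustrate the point raised in the comments above, here is a small sketch comparing the two lookahead patterns. The ID "123visit" is a made-up value, added only to show an input with no digit after the v; it is not part of the question's data, and the names `ids`, `loose` and `strict` are illustrative.

```r
# "123visit" is a hypothetical ID with no digit after 'v'
ids <- c("123v1", "123v01", "123v001", "123visit")

# The lookahead only rules out two digits, so 'v' + non-digit also matches
loose <- grepl("v(?!\\d{2})", ids, perl = TRUE)

# Requiring one digit first, then forbidding a second, is stricter
strict <- grepl("v\\d(?!\\d)", ids, perl = TRUE)
```

With these inputs, `loose` flags both "123v1" and "123visit", while `strict` flags only "123v1".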

score 1 · Accepted Answer · answered Aug 19 '19 at 15:39

You may use

grepl("v\\d(?!\\d)", string, perl=TRUE)

The v\d(?!\d) pattern matches v, then one digit, and then checks that there is no digit immediately to the right of the current location (i.e. after the v + 1 digit).

See the regex demo.

Note that you need to enable PCRE regex flavor with the perl=TRUE argument.
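
As a minimal sketch of how this pattern could cover both steps from the question (finding the single-digit IDs, then padding them), assuming the reprex vector `string` from the question; the names `single` and `padded` are only illustrative:

```r
string <- c("123v1", "123v01", "123v001")

# TRUE only where 'v' is followed by exactly one digit
single <- grepl("v\\d(?!\\d)", string, perl = TRUE)

# Capture the lone digit and re-insert it behind a leading zero;
# IDs that already have two or more digits do not match and stay unchanged
padded <- sub("v(\\d)(?!\\d)", "v0\\1", string, perl = TRUE)
```

Here `single` picks out only "123v1", and `padded` turns it into "123v01" while leaving the other two IDs as they were.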
