I want to count the pieces of a text separated by ;, but only at semicolons that come directly after a number. In the text named txt below, I don't want to count the first two semicolons. Only the ; that follows the 6 (in 805-806) and the ; that follows the 5 (in 123-125) should be considered. In other words, the code should look behind each ; for a digit to decide whether to split there.

library(stringr)
txt <- "A;B; dd (2020) text  pp. 805-806; Mining; exercise (1999), ee, p-123-125; F;G;H text, (2017) kk"

lengths(strsplit(txt, ";")) gives me 8. In my case, however, it should be 3. Any help is highly appreciated.

iGada

1 Answer

We can use a regex lookbehind, (?<=[0-9]), to match only a ; that immediately follows a digit, split on those matches, and take the lengths. The lookbehind needs perl = TRUE.

lengths(strsplit(txt, "(?<=[0-9]);", perl = TRUE))
#[1] 3
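
To see which pieces the lookbehind split actually produces, a minimal sketch along these lines may help. The variable name parts is only illustrative, and trimws is added here just to tidy the leading spaces; neither is required for the count.

```r
txt <- "A;B; dd (2020) text  pp. 805-806; Mining; exercise (1999), ee, p-123-125; F;G;H text, (2017) kk"

# Split only at semicolons that are directly preceded by a digit
parts <- strsplit(txt, "(?<=[0-9]);", perl = TRUE)[[1]]

# Drop the leading space left behind after each split point
parts <- trimws(parts)

print(parts)
length(parts)
```

The semicolons after A, B, Mining, F and G stay inside their pieces because none of them is preceded by a digit.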

Or use str_count to count the semicolons that follow a digit and add 1 to get the number of pieces

library(stringr)
str_count(txt, '[0-9];') + 1
#[1] 3
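
If stringr is not available, the same count can probably be obtained in base R with gregexpr. This is only a sketch: count_pieces is a hypothetical helper name, and it assumes a single input string.

```r
txt <- "A;B; dd (2020) text  pp. 805-806; Mining; exercise (1999), ee, p-123-125; F;G;H text, (2017) kk"

# Hypothetical helper: number of pieces = (semicolons preceded by a digit) + 1
count_pieces <- function(x) {
  m <- gregexpr("[0-9];", x)[[1]]
  # gregexpr returns -1 when there is no match at all
  n_matches <- if (m[1] == -1) 0L else length(m)
  n_matches + 1L
}

count_pieces(txt)
```

One caveat: the counting approaches and the strsplit approach can disagree when the text ends with a matching ;, since strsplit does not return a trailing empty piece.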
akrun