I want to extract the country from a vector of strings like 'M_South_Africa_5_14' and 'P_Zimbabwe_Tot'. I have been trying, unsuccessfully, to do it with a single grep or stringr::str_extract call. Of course, I could split each string on '_' and then reassemble the pieces, but is it possible to do it with a single regular expression? This is my attempt so far:
grep(value = TRUE,
x = 'M_South_Africa_5_14',
pattern = '(?!^[PMF]{1})(?![_])([A-Za-z]{2,20})[_][A-Za-z]{2,20}(?!$)|(?!^[PMF]{1})(?![_])([A-Za-z]{2,20})',
perl = TRUE)
It would be great to simplify this regex monster, of course, but what I actually want to know is whether regex lookarounds (lookahead and lookbehind) can be used in R at all.
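For reference, here is a minimal base R sketch of the kind of single-expression extraction being asked about. It relies on two assumptions that are not stated above and are inferred only from the two sample strings: the prefix is a single letter P, M or F followed by '_', and the part after the country either starts with a digit or is exactly 'Tot'. Note that grep(value = TRUE) returns the whole matching element rather than the matched substring, so the sketch uses regexpr together with regmatches instead.

```r
# Sample input taken from the question.
x <- c("M_South_Africa_5_14", "P_Zimbabwe_Tot")

# Assumed layout (inferred from the two samples only):
#   <P|M|F>_<country words joined by "_">_<digits... or "Tot">
# (?<=^[PMF]_)          lookbehind: the match starts right after the prefix
# [A-Za-z]+(?:_[A-Za-z]+)*?   country words, taking as few extra words as possible
# (?=_(?:\\d|Tot$))     lookahead: next comes "_" plus a digit, or a final "_Tot"
pattern <- "(?<=^[PMF]_)[A-Za-z]+(?:_[A-Za-z]+)*?(?=_(?:\\d|Tot$))"

# perl = TRUE switches to PCRE, which is what enables lookarounds in base R.
# regmatches() silently drops elements without a match, so the result can be
# shorter than x if some strings do not follow the assumed layout.
country <- regmatches(x, regexpr(pattern, x, perl = TRUE))
print(country)
```

Under the same assumptions, a capture group with sub() would avoid lookarounds altogether, for example sub("^[PMF]_(.*?)_(?:\\d.*|Tot)$", "\\1", x, perl = TRUE), at the cost of returning non-matching strings unchanged.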