I would like to match the word after a - in my text. If that matched word is also the end of another word, I would like to split that other word so the matched word stands on its own.
Example of the text:
JOHN LION - XYZ RAN RUN TREEABC GRASS - ABC LIMB RAN RUN LION -XYZ JOG SUN
SKY - ABC LION JOHN PONDABC RUN - PDF STONE
What I would like the text to look like:
JOHN LION - XYZ RAN RUN TREE ABC GRASS - ABC LIMB RAN RUN LION -XYZ JOG SUN
SKY - ABC LION JOHN POND ABC RUN - PDF STONE
I do not want to do a grepl and a gsub on ABC, because the word after the dash is always changing and will appear multiple times. Also, the word in front of the matched word will always be different and will not always be TREE. No matter what the word in front of the matched word is, I always want to do a split.
If I do the following str_extract:
str_extract(df, "(?<=-\\s)\\w+")
Then I match XYZ, not ABC.
I only want to match the word after the - if it is also at the end of another word, but again I do not know what that other word will be.
I am stuck as to what to do. Please let me know if any further information is needed. Any help will be greatly appreciated.
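
One possible approach, sketched here in base R rather than stringr: first collect every word that follows a dash, then put a space in front of any of those words when it ends a longer word. The helper name split_glued_keys is hypothetical, and the sketch assumes that the dash words are looked up separately for each string, that the space after the dash is optional (as in -XYZ), and that the words contain only letters, digits, or underscores.

```r
# Hypothetical helper, not part of the original post.
split_glued_keys <- function(x) {
  vapply(x, function(s) {
    # Collect every "-WORD" or "- WORD" occurrence in this string.
    hits <- regmatches(s, gregexpr("-\\s*\\w+", s, perl = TRUE))[[1]]
    if (length(hits) == 0) return(s)
    # Drop the dash and any whitespace to keep only the words themselves.
    keys <- unique(sub("^-\\s*", "", hits))
    # Match a key only when it ends a longer word: a word character must
    # come right before it and a word boundary right after it.
    pattern <- paste0("(?<=\\w)(", paste(keys, collapse = "|"), ")\\b")
    gsub(pattern, " \\1", s, perl = TRUE)
  }, character(1), USE.NAMES = FALSE)
}

texts <- c(
  "JOHN LION - XYZ RAN RUN TREEABC GRASS - ABC LIMB RAN RUN LION -XYZ JOG SUN",
  "SKY - ABC LION JOHN PONDABC RUN - PDF STONE"
)
result <- split_glued_keys(texts)
print(result)
```

Because the pattern is rebuilt from whatever follows the dashes in each string, nothing like ABC or TREE is hard-coded. A dash word that already stands alone is left untouched, since the lookbehind requires a word character directly in front of it.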