I'm trying to extract TV show names from a txt file using R.
I have loaded the file and assigned its contents to a variable called txt. Now I'm trying to use a regular expression to extract just the information I want.
The lines I want to extract look like this:
SHOW: Game of Thrones 7:00 PM EST
SHOW: The Outsider 3:00 PM EST
SHOW: Don't Be a Menace to South Central While Drinking Your Juice In The Hood 10:00 AM EST
and so on. There are about 320 shows and I want to extract all 320 of them.
So far, I've come up with this.
library(stringr)
pattern <- "SHOW:\\s\\w*"
str_extract_all(txt, pattern)
However, it doesn't extract the entire title as I intended (e.g., it extracts "SHOW: Game" instead of "SHOW: Game of Thrones"). If I only wanted that one show, I could use "SHOW:\\s\\w*\\s\\w*\\s\\w*" to match its word count, but I want to extract every show in txt, whether the name is longer or shorter.
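A minimal reproducible sketch of the problem follows. It assumes txt is a character vector with one entry per line; the three sample lines stand in for the real file. It also includes one tentative direction, a lazy match followed by a lookahead for the time, which assumes every title is followed by a time such as "7:00 PM". Whether that assumption holds for all 320 entries is untested.

```r
library(stringr)

# Assumed sample input; the real txt is read from the file
txt <- c(
  "SHOW: Game of Thrones 7:00 PM EST",
  "SHOW: The Outsider 3:00 PM EST",
  "SHOW: Don't Be a Menace to South Central While Drinking Your Juice In The Hood 10:00 AM EST"
)

# Current attempt: \\w* stops at the first non-word character,
# so only the first word of each title is captured
pattern <- "SHOW:\\s\\w*"
current <- unlist(str_extract_all(txt, pattern))
print(current)

# Tentative alternative (an assumption, not a confirmed fix):
# match lazily up to, but not including, a time such as " 7:00 PM"
candidate <- "SHOW:\\s.+?(?=\\s\\d{1,2}:\\d{2}\\s[AP]M)"
titles <- unlist(str_extract_all(txt, candidate))
print(titles)
```

With the sample lines, the current pattern returns "SHOW: Game", "SHOW: The" and "SHOW: Don", while the candidate returns each title in full.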
How should I write the regular expression to get the intended result?