I am new to R and want to extract, from each string, the alphanumeric words (words containing both letters and digits) AND the words containing more than one uppercase letter.
Below is an example character vector and my desired output for it.
x <- c("123AB123 Electrical CDe FG123-4 ...",
"12/1/17 ABCD How are you today A123B",
"20.9.12 Eat / Drink XY1234 for PQRS1",
"Going home H123a1 ab-cd1",
"Change channel for al1234 to al5678")
#Desired Output
#[1] "123AB123 CDe FG123-4" "ABCD A123B" "XY1234 PQRS1"
#[4] "H123a1 ab-cd1" "al1234 al5678"
I have come across 2 separate solutions so far on Stack Overflow:
- Extract all words that contain a number --> Not helpful to me, because the column I'm applying the function to contains many date strings, e.g. "12/1/17 ABCD How are you today A123B", and the dates would be extracted as well.
- Identify strings that have more than one uppercase letter --> Pierre Lafortune has provided the following solution:
how-to-count-capslock-in-string-using-r
library(stringr)
str_count(x, "\\b[A-Z]{2,}\\b")
His code counts, for each string, the words made up of two or more uppercase letters, but I want to extract those words rather than count them, and extract the alphanumeric words as well.
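To make the two criteria concrete, here is a minimal base-R sketch of how they might be combined. It rests on assumptions read off the desired output rather than anything confirmed: a "word" is a whitespace-separated token, "alphanumeric" means the token has at least one letter and at least one digit, and "more than 1 uppercase" means at least two uppercase letters anywhere in the token. The helper name `keep_word` is made up for illustration.

```r
x <- c("123AB123 Electrical CDe FG123-4 ...",
       "12/1/17 ABCD How are you today A123B",
       "20.9.12 Eat / Drink XY1234 for PQRS1",
       "Going home H123a1 ab-cd1",
       "Change channel for al1234 to al5678")

# Hypothetical helper: TRUE for each token that should be kept.
keep_word <- function(w) {
  has_letter <- grepl("[A-Za-z]", w, perl = TRUE)
  has_digit  <- grepl("[0-9]", w, perl = TRUE)
  two_upper  <- grepl("[A-Z].*[A-Z]", w, perl = TRUE)
  # Keep tokens with both a letter and a digit, or with two uppercase letters.
  (has_letter & has_digit) | two_upper
}

# Assumption: words are separated by whitespace.
tokens <- strsplit(x, "\\s+")

# Filter each string's tokens and glue the survivors back together.
result <- vapply(tokens,
                 function(w) paste(w[keep_word(w)], collapse = " "),
                 character(1))
result
```

Under these assumptions the date strings "12/1/17" and "20.9.12" are dropped because they contain no letters, and the token "PQRS1" is kept whole because it contains a digit.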
Forgive me if my question or research is not comprehensive enough. I will post the solution I found for extracting all words containing a number in 12 hours, when I have access to my workstation, which has R and the dataset.