My dataframe has a variety of strings. See sample df:
strings <- c("Average complications and higher payment",
"Average complications and average payment",
"Average complications and lower payment",
"Average mortality and higher payment",
"Better mortality and average payment")
df <- data.frame(strings, stringsAsFactors = FALSE)
I'd like to isolate the first word in the sentence and the second-to-last. The second-to-last will always precede the word "payment."
Here's what my desired df would look like:
strings <- c("Average complications and higher payment",
"Average complications and average payment",
"Average complications and lower payment",
"Average mortality and higher payment",
"Better mortality and average payment")
QualityWord <- c("Average","Average","Average","Average","Better")
PaymentWord <- c("Higher","Average","Lower","Higher","Average")
desireddf <- data.frame(strings, QualityWord, PaymentWord, stringsAsFactors = FALSE)
The case of the extracted words doesn't matter: "higher" is just as good as "Higher".
I can already get the first word of a sentence by splitting at the space, but I can't figure out how to pull the word immediately to the left (or right, for that matter) of a reference word, which is "payment" in this case.
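To make the goal concrete, here is a minimal base R sketch of one way this might be approached. It assumes every string contains the word "payment" preceded by at least one other word. The regular expressions and the `capitalize` helper are illustrative assumptions, not the only way to do it.

```r
strings <- c("Average complications and higher payment",
             "Average complications and average payment",
             "Average complications and lower payment",
             "Average mortality and higher payment",
             "Better mortality and average payment")
df <- data.frame(strings, stringsAsFactors = FALSE)

# First word: drop everything from the first space onward.
df$QualityWord <- sub(" .*$", "", df$strings)

# Word immediately left of "payment": capture the run of non-space
# characters that precedes whitespace + "payment" and keep only that group.
# Note: sub() returns the input unchanged when the pattern does not match.
before_payment <- sub("^.*?(\\S+)\\s+payment.*$", "\\1", df$strings, perl = TRUE)

# Hypothetical helper: upper-case the first letter to mirror the desired df.
capitalize <- function(x) paste0(toupper(substring(x, 1, 1)), substring(x, 2))
df$PaymentWord <- capitalize(before_payment)
```

For the word to the right of a reference word, the same idea should work with the capture group moved after the reference word, for example a pattern along the lines of `"^.*?payment\\s+(\\S+).*$"`.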