I was hoping for some help with extracting the last N words from a column in a data.table and then assigning them to a new column.
library(data.table)
test <- data.table(original = c('the green shirt totally brings out your eyes',
                                'ford focus hatchback'))
The original data.table looks like this:
                                       original
1: the green shirt totally brings out your eyes
2:                         ford focus hatchback
I want to extract (up to) the last 5 words into a new column, so that the output looks like:
                                       original                    extracted
1: the green shirt totally brings out your eyes totally brings out your eyes
2:                         ford focus hatchback         ford focus hatchback
I tried:
test <- test[, extracted := paste0(tail(strsplit(original, ' ')[[1]], 5)
, collapse = ' ')]
This almost works, except that the first value in the 'extracted' column is repeated in every row of the new column:
                                       original                    extracted
1: the green shirt totally brings out your eyes totally brings out your eyes
2:                         ford focus hatchback totally brings out your eyes
For the life of me I can't figure this out. I also tried the 'word' function from 'stringr', which gives me the last word, but I can't seem to get it to count backwards from the end to pick up more than that.
Any help would be greatly appreciated!
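For reference, here is a minimal sketch of one possible fix. It rests on the observation that strsplit returns a list with one element per row, so the [[1]] in the attempt above keeps only the first row's words, and the single pasted string is then recycled down the whole column. The helper name last_n_words is hypothetical (it is not part of data.table or stringr), and the sketch assumes words are separated by single spaces:

```r
library(data.table)

test <- data.table(original = c('the green shirt totally brings out your eyes',
                                'ford focus hatchback'))

# Hypothetical helper (not from any package): keep up to the last n words
# of each string. Assumes words are separated by single spaces.
last_n_words <- function(x, n = 5) {
  # strsplit() returns a list with one character vector of words per input string
  words_per_row <- strsplit(x, ' ', fixed = TRUE)
  # Apply tail() and paste() to each row's words separately,
  # rather than only to the first list element
  vapply(words_per_row,
         function(words) paste(tail(words, n), collapse = ' '),
         character(1))
}

# := adds the column by reference, so no reassignment of test is needed
test[, extracted := last_n_words(original, 5)]
print(test)
```

Because tail() returns the whole vector when it has fewer than n elements, rows with fewer than five words should keep all of their words, which is the "up to" behaviour described above.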