I'm trying to add a new column keywords that will get the value TRUE if the word occurs in a list of keywords. The value will be FALSE if the word doesn't occur in keywordslist. My keyword list consists of more than 100 words, so adding the words manually is not an option.
keywordslist (sample):
thank
impressed
this
I have a data frame with the columns id and word. I have unnested the words and grouped by id:
id word
1234 thank
1234 you
1234 very
1234 much
1567 i
1567 am
1567 not
1567 impressed
9654 what
9654 is
9654 this
I would like the result to look like this:
id word keywords
1234 thank TRUE
1234 you FALSE
1234 very FALSE
1234 much FALSE
1567 i FALSE
1567 am FALSE
1567 not FALSE
1567 impressed TRUE
9654 what FALSE
9654 is FALSE
9654 this TRUE
The code I have tried is as follows:
df <- df %>%
  group_by(id) %>%
  mutate(keywords = ifelse(
    word == rowwise(keywordslist), TRUE, FALSE))
This code raises the following error:
Error in mutate_impl(.data, dots) : Evaluation error: is.data.frame(data) is not TRUE.
I have also tried a slightly different variant with grepl:
df <- df %>%
  group_by(id) %>%
  mutate(keywords = ifelse(
    word == rowwise(grepl(keywordslist, word)), TRUE, FALSE))
This raised the following error:
Error in mutate_impl(.data, dots) : Evaluation error: is.data.frame(data) is not TRUE.
In addition: Warning message:
In grepl(keywordslist, keywords) : argument 'pattern' has length > 1 and only the first element will be used
I'm no longer sure this is the right way to approach the problem. Any help is welcome.
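
For reference, both errors look consistent with how the functions are being called: rowwise() expects a data frame as its argument, and grepl() takes a single pattern, so a vector of keywords gets truncated to its first element. One possible approach is a vectorised membership test with %in%. The sketch below is a minimal example in base R that rebuilds the sample data from the question; it assumes keywordslist is a plain character vector, which the question does not confirm.

```r
# Sample data reconstructed from the question
df <- data.frame(
  id = c(1234, 1234, 1234, 1234, 1567, 1567, 1567, 1567, 9654, 9654, 9654),
  word = c("thank", "you", "very", "much",
           "i", "am", "not", "impressed",
           "what", "is", "this"),
  stringsAsFactors = FALSE
)

# Assumption: keywordslist is a character vector (only the sample is shown here)
keywordslist <- c("thank", "impressed", "this")

# %in% is vectorised: it returns one TRUE/FALSE per row,
# so no ifelse(), rowwise() or grouping is needed
df$keywords <- df$word %in% keywordslist

# Equivalent with dplyr, if it is loaded:
# df <- df %>% mutate(keywords = word %in% keywordslist)

print(df)
```

If keywordslist is actually a data frame, the comparison would need the relevant column pulled out first, for example keywordslist$word, where the column name word is hypothetical.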