I am using the stop_words dataset from the tidytext package in R to remove stop words, with the following code:
library(tidyverse)
library(tidytext)
library(dplyr)
data(stop_words)
example_words <- c("the", "quick", "brown", "fox", "jumps", "over", "the", "lazy", "dog","i'm","don’t","it’s","i’ve")
filtered_words <- example_words[!example_words %in% stop_words$word]
filtered_words
The final output is as follows:
> filtered_words
[1] "quick" "brown" "fox" "jumps" "lazy" "dog" "don’t" "it’s" "i’ve"
As you can see, stop words such as "don’t", "it’s", and "i’ve" are still present in the filtered output, even though they appear in the stop word dataset. Could anyone help me figure out why some words that are in the stop word dataset are not being removed?
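
One possible cause worth checking: "i'm" in the example is typed with a straight ASCII apostrophe (') and is removed, while the three words that survive use a typographic apostrophe (’, U+2019). If the entries in stop_words$word use the ASCII apostrophe, %in% will not match them, because it compares strings exactly. The sketch below assumes that this is the cause and normalizes the apostrophe before filtering; normalized_words is a name introduced here for illustration only.

```r
library(tidytext)  # provides the stop_words dataset

# Same words as in the question; \u2019 is the typographic apostrophe
example_words <- c("the", "quick", "brown", "fox", "jumps", "over", "the",
                   "lazy", "dog", "i'm", "don\u2019t", "it\u2019s", "i\u2019ve")

# Assumption: stop_words$word uses the ASCII apostrophe, so replace U+2019 with '
normalized_words <- gsub("\u2019", "'", example_words, fixed = TRUE)

# Filter exactly as before, but on the normalized words
filtered_words <- normalized_words[!normalized_words %in% stop_words$word]
filtered_words
```

To confirm which character a word actually contains, utf8ToInt("’") returns 8217 for the typographic apostrophe and utf8ToInt("'") returns 39 for the ASCII one. If the text may also contain the left single quotation mark (U+2018), it can be replaced in the same way.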