I'm very new to R and coding, and I'm trying to do a frequency analysis on a long list of sentences and their given weights. I've unnested and mutated the data, but when I try to remove stop words, the order of the words within each sentence gets scrambled. I need to create bigrams later on, and I'd prefer them to be based on the original phrase.
Here's the relevant code; I can provide more if this isn't enough:
library(dplyr)
library(tidytext)

# Drop stop words (stop_words ships with tidytext), then drop missing words
data <- data %>%
  anti_join(stop_words, by = "word") %>%
  filter(!is.na(word))
What can I do to retain the original sort order within each sentence? I have all the words in a sentence indexed so I can match them to their given weight. Is there a better way to remove stop words that doesn't mess up the sort order?
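For what it's worth, here is a minimal sketch of the workaround I'm considering: record each word's position before removing stop words, then sort by that position afterwards. The toy data and the column names (`sentence_id`, `text`, `weight`, `word_index`) are made up for illustration and are not my real data, and I'm assuming the stop-word step is where the order changes.

```r
library(dplyr)
library(tidytext)

# Toy data; sentence_id, text, and weight are hypothetical column names
sentences <- tibble(
  sentence_id = c(1, 2),
  text = c("the quick brown fox jumps over the lazy dog",
           "a stitch in time saves nine"),
  weight = c(0.7, 0.3)
)

cleaned <- sentences %>%
  unnest_tokens(word, text) %>%           # one row per word
  group_by(sentence_id) %>%
  mutate(word_index = row_number()) %>%   # record each word's original position
  ungroup() %>%
  anti_join(stop_words, by = "word") %>%  # drop stop words
  filter(!is.na(word)) %>%
  arrange(sentence_id, word_index)        # put rows back in original order
```

Because `word_index` is assigned before any rows are removed, it should still line up with the original sentence even if a later step shuffles the rows, so the final `arrange()` can always restore the order.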
I saw a similar question here, but it's unresolved: How to stop anti_join from reversing sort order in R?
I also tried this, but it didn't work: dplyr How to sort groups within sorted groups?
A colleague helped me write this, but unfortunately they're no longer available, so any insight would be helpful. Thanks!