I am very new to R and am trying to build an n-gram word cloud. However, my results always show unigrams (single words) instead of n-grams. I have searched the web for days and tried several different methods, all with the same result. Also, for some reason the NGramTokenizer function that everyone else seems to use is not available to me, so I am using another tokenizer function that I found, shown in the code here. I hope someone can help me out. Thanks in advance!
library(dplyr)
library(ggplot2)
library(tidytext)
library(wordcloud)
library(tm)
library(RTextTools)
library(readxl)
library(qdap)
library(RWeka)
library(tau)
library(quanteda)
rm(list = ls())
#setwd("C:\\RStatistics\\Data\\")
#allverbatims <-read_excel("RS_Verbatims2018.xlsx") #reads excel files
#selgroup <- subset(allverbatims, FastNPS=="Detractors")
#selcolumns <- selgroup[ ,3:8]
#sample data
selcolumns <- c("this is a test","my test is not working","sample data here")
Comments <- Corpus(VectorSource(selcolumns))
CommentClean <- tm_map(Comments, removePunctuation)
CommentClean <- tm_map(CommentClean, content_transformer(tolower))
CommentClean <- tm_map(CommentClean, removeNumbers)
CommentClean <- tm_map(CommentClean, stripWhitespace)
CommentClean <- tm_map(CommentClean, removeWords, stopwords("english"))
#create manual tokenizer using tau textcnt since NGramTokenizer is not available
tokenize_ngrams <- function(x, n = 2) {
  # count n-grams with tau::textcnt and return the n-gram strings (the names of the counts)
  rownames(as.data.frame(unclass(textcnt(x, method = "string", n = n))))
}
#test tokenizer
head(tokenize_ngrams(CommentClean))
td_mat <- TermDocumentMatrix(CommentClean, control = list(tokenize = tokenize_ngrams))
inspect(td_mat) # expected bigrams, but the terms are single words (unigrams)
# named term_matrix so that the base function matrix() is not masked
term_matrix <- as.matrix(td_mat)
sorted <- sort(rowSums(term_matrix), decreasing = TRUE)
data_text <- data.frame(word = names(sorted),freq = sorted)
set.seed(1234)
wordcloud(words = data_text$word, freq = data_text$freq, min.freq = 5, max.words = 100, random.order = FALSE, rot.per = 0.1, colors = rainbow(30))
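
A possible explanation, offered as an assumption rather than a confirmed diagnosis: in recent versions of tm, Corpus() on a VectorSource appears to build a SimpleCorpus, and TermDocumentMatrix() seems to ignore a custom tokenize function for that class. The sketch below uses VCorpus() instead, together with a bigram tokenizer along the lines of the one in the tm FAQ, built on NLP::words and NLP::ngrams. The names comments, bigram_tokenizer and bigram_freq are made up for this illustration, and stop-word removal is left out to keep the example small.

```r
library(tm)  # tm depends on NLP, which provides words() and ngrams()

# Sample data from the script above
selcolumns <- c("this is a test", "my test is not working", "sample data here")

# Assumption: VCorpus() (not Corpus()) so that the custom tokenizer is applied
comments <- VCorpus(VectorSource(selcolumns))
comments <- tm_map(comments, content_transformer(tolower))
comments <- tm_map(comments, removePunctuation)
comments <- tm_map(comments, stripWhitespace)

# Hypothetical helper: split a document into words, then paste each
# run of two consecutive words into one bigram string
bigram_tokenizer <- function(x) {
  unlist(lapply(NLP::ngrams(NLP::words(x), 2L), paste, collapse = " "),
         use.names = FALSE)
}

td_mat <- TermDocumentMatrix(comments, control = list(tokenize = bigram_tokenizer))

# Frequency of each bigram across all documents
bigram_freq <- sort(rowSums(as.matrix(td_mat)), decreasing = TRUE)
print(bigram_freq)
```

If this behaves as assumed, the terms in td_mat should be two-word strings, and names(bigram_freq) and bigram_freq can be passed to wordcloud() as words and freq. With data this small, min.freq would need to be lowered to 1 for anything to be drawn.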