I'm currently using RStudio to do text mining on support tickets, clustering them by their free-text description. For this, I am comparing k-means with the EM algorithm. I prepared the data with the tm package, and now I am trying to apply clustering algorithms to the data matrix.
With the kmeans() function, I can use the following code snippet to output the 5 most frequent terms in each text cluster (kmeans21 is the fitted k-means model):
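> for (i in seq_len(num_cluster)) {
    cat(paste("cluster ", i, ": ", sep = ""))
    # sort the terms of cluster i by their center value, highest first
    s <- sort(kmeans21$centers[i, ], decreasing = TRUE)
    # print the names of the 5 top-ranked terms
    cat(names(s)[1:5], "\n")
  }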
So far, I couldn't find a function that does the same within the mclust package. I fit the model as follows (m1 is the data matrix):
> bic21 <- MclustBIC(m1, G=21)
> emmodel21 <- summary(bic21, data = m1)
With the command
> emmodel21$classification
I can see the cluster assigned to each support ticket, but is there also a way to output the most frequent terms per cluster, like in the first code block for k-means?
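
To make the desired output concrete, here is a minimal sketch of one way this might be done with base R only: group the rows of the matrix by cluster label, average each term column within the group, and keep the top-ranked terms. The helper name `top_terms` and the toy objects `m` and `cls` are hypothetical stand-ins for illustration. With the real data, `m1` and `emmodel21$classification` would take their place, assuming `m1` is a plain documents-by-terms matrix with the terms as column names (a DocumentTermMatrix from tm would probably need `as.matrix()` first).

```r
# Toy stand-ins (hypothetical): a documents-by-terms matrix and one cluster label per document.
m <- matrix(c(3, 0, 1, 0,
              2, 1, 1, 0,
              0, 4, 0, 2,
              0, 3, 1, 3),
            nrow = 4, byrow = TRUE,
            dimnames = list(NULL, c("login", "printer", "email", "toner")))
cls <- c(1, 1, 2, 2)  # with mclust this would be emmodel21$classification

# Hypothetical helper: for each cluster, rank terms by their mean value
# over the documents assigned to that cluster and return the top n names.
top_terms <- function(m, cls, n = 5) {
  lapply(split(seq_len(nrow(m)), cls), function(rows) {
    s <- sort(colMeans(m[rows, , drop = FALSE]), decreasing = TRUE)
    names(s)[seq_len(min(n, length(s)))]
  })
}

res <- top_terms(m, cls, n = 2)

# Print one line per cluster, in the same style as the k-means loop.
for (k in names(res)) {
  cat("cluster ", k, ": ", paste(res[[k]], collapse = " "), "\n", sep = "")
}
```

This ranks terms by the per-cluster mean of the hard assignments, which mirrors what `kmeans21$centers` provides. As far as I can tell, the fitted component means are also stored in `emmodel21$parameters$mean` (one column per component), so sorting those columns might be an alternative, but that is an assumption to check against the mclust documentation.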