Does anybody know if there is a way to classify new text data into topics using the R package mallet?
The general routine for this package is:
library(mallet)
mallet.instances <- mallet.import(as.character(data$id),
                                  as.character(data$text),
                                  "Documents/Projects/tm/stopwords.txt")
topic.model <- MalletLDA(num.topics = 10)
topic.model$loadDocuments(mallet.instances)
topic.model$setAlphaOptimization(20, 100)  # optimise hyperparameters every 20 iterations, after 100 burn-in iterations
topic.model$train(1000)  # train the model
topic.model$maximize(10)  # pick the best topic for each token
But I could not find a way to classify new data using a pre-trained model. The alternatives would be to use the topicmodels package or to run Mallet from the command line. Both options are reasonable (although I must say that I tend to get much more convincing results with Mallet than with topicmodels). However, if I have already trained a model with the R package mallet and do not want to change the topics, a way to classify new data with the mallet package would be quite helpful.