Questions tagged [topicmodels]

topicmodels is an R package implementing Latent Dirichlet Allocation topic modeling.

Excerpt from topicmodels page on CRAN:

Provides an interface to the C code for Latent Dirichlet Allocation (LDA) models and Correlated Topics Models (CTM) by David M. Blei and co-authors and the C++ code for fitting LDA models using Gibbs sampling by Xuan-Hieu Phan and co-authors.

101 questions
0
votes
1 answer

LDA with topicmodels package for R, how do I get the topic probability for each term?

I'm using the topicmodels package for LDA. I would like to create a visualization that shows how related or unrelated the topics are. I envision a cluster of words that are unique to topic 1, but with a few shared keywords connecting to…
lmcshane
0
votes
1 answer

R in Windows cannot handle some characters

I performed LDA in Linux and didn't get characters like "ø" in topic 2. However, when I run the same code in Windows, they appear. Does anyone know how to deal with this? I used the quanteda and topicmodels packages. > terms(LDAModel1,5) Topic 1 Topic 2 [1,] "car" …
user1569341
0
votes
0 answers

How to get a probability distribution for a topic in mallet?

Using mallet I can get a specific number of topics and their words. How can I make sure the topic words form a probability distribution (i.e., sum to one)? For example, if I run it as below, how can I use the outputs given by mallet to make sure…
samsamara
0
votes
1 answer

Predicting topics with LDA

I am trying to extract topic assignments from a fit I built with R's 'lda' package. I created a fit: fit <- lda.collapsed.gibbs.sampler(documents = documents, K = K, vocab = vocab, num.iterations = G, alpha = alpha, eta = eta, initial = NULL, …
Sylvia
0
votes
0 answers

DocumentTermMatrix() returns 0 terms in the tm package

I have an object like this: str(apps) chr [1:17517] "35 44 33 40 33 40 44 38 33 37 37" ... In each row, the numbers are separated by spaces. corpus<-Corpus(VectorSource(apps)) dtm<-DocumentTermMatrix(corpus) str(dtm) List of 6 $ i : int(0) $…
ysfseu
0
votes
1 answer

Different results of LDA using R (topicmodels)

I am using R topicmodels to train an LDA model on a small corpus, but I find that every time I rerun the same code, I get different results (different topics and different topic terms). My question is why the same conditions and the same corpus…
Snow
0
votes
1 answer

Manually Specifying a Topic Model in R

I have a corpus of text with each line in the csv file uniquely specifying a "topic" I am interested in. If I were to run a topic model on this corpus using an LDA or Gibbs method from either the topicmodels package or lda, as expected I would get…
william
0
votes
0 answers

which.max(sapply, train_gibbs, logLik) error

So, I am following Grün and Hornik's (http://www.jstatsoft.org/v40/i13/) method of 10-fold cross-validation, calculating perplexity from the 10-fold training and test sets. But I get an error when I create test_gibbs, which is shown at the end of the code…
user37874
0
votes
1 answer

document-topic probability after training topic models using "topicmodels" in R: gamma or posterior()?

Below is what I get after training on 3328 text files using Gibbs sampling. I need to save a file that contains the document-topic probabilities. Is gamma the document-topic probability? But most of the numbers are smoothed and not very informative in terms…
user37874
0
votes
2 answers

How do you normalize the rows of a document term matrix in place in R?

I have a DocumentTermMatrix named train_dtm and I want to normalize the term frequency counts in all the documents. The problem I am facing is that the resulting matrix should also be of type DocumentTermMatrix because I want to…
London guy
-1
votes
1 answer

R: topicmodels, 2 similar documents, code works with one, doesn't with the other

I have quite a strange error occurring when I run my topicmodels code. Basically I have a .csv file with user comments. I want to create a dtm with each comment being one document. I took a sample of 8k comments and used the following code on it: >…
Andres