Questions tagged [topicmodels]

topicmodels is an R package for fitting topic models, including Latent Dirichlet Allocation (LDA) and Correlated Topic Models (CTM).

Excerpt from topicmodels page on CRAN:

Provides an interface to the C code for Latent Dirichlet Allocation (LDA) models and Correlated Topics Models (CTM) by David M. Blei and co-authors and the C++ code for fitting LDA models using Gibbs sampling by Xuan-Hieu Phan and co-authors.
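As a rough illustration of the interface described above, the following minimal sketch fits an LDA model by variational EM, an LDA model by Gibbs sampling, and a CTM on a small slice of the AssociatedPress data that ships with the package. The subset size, the number of topics, and the burnin/iter values are arbitrary assumptions chosen only to keep the run short, not recommended settings.

```r
library(topicmodels)

# AssociatedPress is a DocumentTermMatrix bundled with topicmodels
data("AssociatedPress", package = "topicmodels")

# Assumption: a 20-document subset, purely to keep the example fast
dtm <- AssociatedPress[1:20, ]

# LDA fitted with variational EM (the default method)
lda_vem <- LDA(dtm, k = 2, control = list(seed = 1234))

# LDA fitted with collapsed Gibbs sampling
lda_gibbs <- LDA(dtm, k = 2, method = "Gibbs",
                 control = list(seed = 1234, burnin = 100, iter = 200))

# Correlated topic model
ctm_fit <- CTM(dtm, k = 2, control = list(seed = 1234))

top_terms  <- terms(lda_vem, 5)   # matrix of the 5 most likely terms per topic
doc_topics <- topics(lda_gibbs)   # most likely topic for each document
post       <- posterior(lda_vem)  # list with $terms (topic-term) and $topics (document-topic)
```

The `post$topics` matrix has one row per document and one column per topic, which is usually the starting point for questions about which documents belong to which topics.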

101 questions
0
votes
2 answers

Re-label the topic number in STM

For presentation, I would like to re-label the topic numbers of an STM topic model (e.g., change 'topic 40' into 'topic 1'). However, I am not sure what I should change (where are the topic numbers stored?).
user7453767
0
votes
2 answers

tidy from broom not finding method for LDA from topicmodels

Running this script, straight from 'Text mining with R', library(topicmodels) library(broom) data("AssociatedPress") ap_lda <- LDA(AssociatedPress, k = 2, control = list(seed = 1234)) tidy(ap_lda) I get this error message: Error in…
Isaiah
0
votes
1 answer

LDA with topicmodels (R), how can I see which topics different documents belong to, with document titles preserved?

I appreciate the answer from Ben here: LDA with topicmodels, how can I see which topics different documents belong to? My question is: How do I preserve the document titles in the last step? For example: Manually create three .txt documents in…
Tyler
0
votes
1 answer

Why am I getting an error in 1:nrow(counts) : argument of length 0

I am doing topic modelling using the topicmodels package in R. I am creating a Corpus object, doing some basic preprocessing, and then creating a DocumentTermMatrix: library(topicmodels) #Set parameters for Gibbs sampling burnin <- 4000 iter <-…
Hani Ihlayyle
0
votes
0 answers

R - Incorporating a precoded training set into lda model

I am attempting to assign a list of surveyed questions into 30 different categories using the LDA function in the topicmodels package. The code I have so far is: source <- VectorSource(openended$q2) corpus <- Corpus(source) corpus <- tm_map(corpus,…
DBH
0
votes
0 answers

undefined symbol: gsl_multimin_fdfminimizer_conjugate_fr when trying to install topicmodels in R

I've been trying to install the topicmodels package in R for quite a few minutes. I've read many tutorials and installed the gsl package, but I'm still getting this error: Error: package or namespace load failed for ‘topicmodels’ in dyn.load(file,…
iatowks
0
votes
1 answer

Adjacent topics graphs

I am trying to plot the network of the word distributions over topics (topic relation) using this code [source]: post <- topicmodels::posterior(ldaOut) cor_mat <- cor(t(post[["terms"]])) cor_mat[ cor_mat < .05 ] <- 0 diag(cor_mat) <- 0 graph <-…
Sultan
0
votes
1 answer

tidy Error in eval(substitute(expr), envir, enclos) : binding not found: 'Var1'

When I apply the tidy function to the result of the LDA model on my dataset, I get the following error: "Error in eval(substitute(expr), envir, enclos) : binding not found: 'Var1'". I get the same error when I use it on the Associated Press example, as shown…
0
votes
1 answer

Topic Modelling

I have an Excel sheet with 6000 records, each representing a message to which I want to assign a topic (for example, whether it is related to sports or news), and I want to infer that topic from the words in the sentence. I want an easy program with a…
0
votes
1 answer

R topicmodels tidytext - Latent Dirichlet Allocation (LDA) : Error: binding not found: 'Var1'

I'm having issues with my LDA model in R. Every time I try to execute the tidy() function on my LDA_VEM object, I get the error "Error: binding not found: 'Var1'". Could you please explain how to remedy this? My code is below: why…
0
votes
0 answers

ggplot2 does not currently support free scales with a non-cartesian coord or coord_flip

With the results of LDA topic model, I am trying to create 30 horizontal bar charts to show top words vs their probabilities. png("airport.png") top_terms %>% mutate(term = reorder(term, beta)) %>% ggplot(aes(term, beta, fill = factor(topic)))…
kevin
0
votes
1 answer

How to specify link distance between nodes in Cytoscape.js?

I am new to Cytoscape.js, so I may be missing something obvious... I know how to do this in D3.js, but need more power to display clustering of a large number of nodes (> 1,000) and don't need to visualize the links. Thanks in advance for pointing me…
Owen
0
votes
1 answer

Is it possible to find the posterior probability of topics generated with LDAvis occurring in a given document? How, if so?

As may or may not be evident from the question, I'm pretty new to R and I could do with a bit of help on this. When creating topic models, I've experimented with LDA and LDAvis - code in (A) and (B) below. LDA in (A) allows me to find the posterior…
Gazzer
0
votes
0 answers

Duplicate document names when processing topic modelling results in R

I am working on creating topic models based on Tweets in R using the topicmodels package. I want to create a dataframe containing all the results from the topic model so that I can insert it into a database. This is how I do it: # create dataframe…
Roska
0
votes
1 answer

Seeding words into an LDA topic model in R

I have a dataset of news articles that have been collected based on the criteria that they use the term "euroscepticism" or "eurosceptic". I have been running topic models using the lda package (with dfm matrices built in quanteda) in order to…