Questions tagged [topicmodels]

topicmodels is an R package implementing Latent Dirichlet Allocation topic modeling.

Excerpt from the topicmodels page on CRAN:

Provides an interface to the C code for Latent Dirichlet Allocation (LDA) models and Correlated Topics Models (CTM) by David M. Blei and co-authors and the C++ code for fitting LDA models using Gibbs sampling by Xuan-Hieu Phan and co-authors.

101 questions
0
votes
0 answers

LDA topic modeling in R : error : Each row of the input matrix needs to contain at least one non-zero entry

news_nov_lda <- LDA(news_nov_dtm2, k=18, seed=1234)
Error in LDA(news_nov_dtm2, k = 18, seed = 1234) : Each row of the input matrix needs to contain at least one non-zero entry
So I tried raw.sum=apply(news_nov_dtm2, 1, FUN =…
0
votes
0 answers

topicmodels has inverted functions $topics and $terms. Is it reliable?

I have a vector of strings (which represent preprocessed documents) on which I want to estimate an LDA model in R. I use functions from the topicmodels library. To make the problem easy to reproduce, I create a vector with…
Thomas GF
0
votes
1 answer

Convert processed format with stm into dtm (Structural topic modeling)

I have used the textProcessor and the prepDocuments functions from the stm package to clean a corpus. Now I would like to convert the resulting object (list of indices plus vocabulary) into a standard document-term matrix (or quanteda…
Dario Lacan
0
votes
0 answers

How to automatically assign human readable labels to a topic?

Hi! I ran a cluster analysis on my structural topic model and it resulted in 98 topics. How can I either a) manually assign topic names for each topic such that topic 98 could be called "teams" for example, or b) automatically assign topic names…
md_14
0
votes
1 answer

Errors running Oolong validation in R on both STM and seededLDA

I'm trying to run the oolong package to validate a couple of topic models I've created. I'm using both an STM model and a seededLDA model (this code won't be reproducible): oolong_test1a <- witi(input_model = model_stm_byt, input_corpus =…
0
votes
1 answer

Remove Backslash in a word in R

I have been trying to do topic modeling for articles. I cleaned the raw data, which contains a lot of backslashes and numbers. Even after removing the punctuation, backslashes, and numbers, I still got backslashes along with numbers in the top terms in…
0
votes
1 answer

Error in LDA(cdes, k = K, method = "Gibbs", control = list(verbose = 25L, : Each row of the input matrix needs to contain at least one non-zero entry

I have a big dataset of almost 90 columns and about 200k observations. One of the columns contains descriptions, so it's text only. However, I have about 100 descriptions that are NA. I tried the code of Pablo Barbera from GitHub concerning Topic…
katdataecon
0
votes
0 answers

Error message on R : "data set 'X' has not been found" when trying to do topic modeling although I have already used that data for other techniques

I am doing a lyrical analysis of Paramore's discography using data from GeniusAPI. I have done most of my analysis after going through data wrangling. I was able to create word clouds and bar charts based on sentiment analysis for each album. But now I…
0
votes
1 answer

Can you print more than 11 covariates for summary.estimateEffect?

I have created an stm topic model and have issues with summary.estimateEffect: I have around 150 days, yet it only prints regression estimates for 10 days. parlPrevFit<- stm(document = out$documents, vocab = out$vocab, K = 0, prevalence…
0
votes
1 answer

How to use the seededlda package in R to retain identification of users for topics

I have been trying to do topic modeling on a collection of discussion forum posts in a MOOC. I have tried basic LDA to create topics, and the topics were meaningless. So now I'm looking into seeding my topics to create better topics. I found the…
Heather_B
0
votes
0 answers

How to apply (semi-)supervised methods (structural topic models, seeded lda) to corpora with only one topic and aggregate their results per year in r?

I am currently working on a project in which I am interested in the prevalence of one topic (social inequality) in German plenary debates and newspaper articles. I am using quantitative text analysis tools in order to generate outputs from texts to…
0
votes
1 answer

How do you combine multiple documents into a single document with topicmodels in r?

I am currently trying to combine multiple documents of a corpus into a single document using the topicmodels package. I initially imported my data through multiple csvs, each with multiple lines of text. When I import each csv, however, each line of…
morivera
0
votes
2 answers

Can I manually reorder an LDA_Gibbs topicmodel?

I have an LDA_Gibbs topicmodel from the topicmodels library. I also have an LDAvis interactive visualisation. My issue is that the topics are not in the same order in the LDA object and in LDAvis. I'd like to get one to map to the other (don't care…
0
votes
0 answers

How to Do Topic Modelling and Classification on Each Sentence Comment in a Data Frame in R?

Is there a way to do topic modelling and classification on a data frame of comments in R? I have 10 columns of comments (where each comment is an open-ended sentence on a topic related to a question) and I want to classify each of these comments by…
Dew Man
0
votes
1 answer

How to create a dtm without losing rows

I am trying to run an LDA. I have to convert my input to an appropriate format using dtm <- convert(myDfm, to = "topicmodels"). However, I lose 2-3 documents from my initial input in the process, and I don't know why. As a result I can merge the topic with the…
Nathalie