Questions tagged [lda]

Latent Dirichlet Allocation (LDA) is a generative model that explains sets of observations through unobserved groups, which account for why some parts of the data are similar.

If the observations are words collected into documents, LDA posits that each document is a mixture of a small number of topics and that each word is attributable to one of the document's topics. In other words, documents are modeled as mixtures of topics, and each topic emits words with certain probabilities.

It should not be confused with Linear Discriminant Analysis, a supervised learning procedure for classifying observations into a set of categories.
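
The generative story in the tag description can be sketched in a few lines of standard-library Python. This is only an illustration of how a document would be produced under the model, not an inference routine and not any library's API. The two topics, their word probabilities, and the helper names `sample_dirichlet` and `generate_document` are made up for the example.

```python
import random

# Hypothetical topics: each maps words to emission probabilities (invented for illustration).
TOPICS = {
    "sports": {"game": 0.5, "team": 0.3, "score": 0.2},
    "cooking": {"recipe": 0.4, "oven": 0.3, "flour": 0.3},
}


def sample_dirichlet(alpha, rng):
    """Draw one sample from Dirichlet(alpha) by normalizing independent Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def generate_document(n_words, alpha, rng):
    """Generate one document following the LDA generative process."""
    topic_names = list(TOPICS)
    # 1. Draw this document's topic mixture.
    theta = sample_dirichlet(alpha, rng)
    words = []
    for _ in range(n_words):
        # 2. Pick a topic for this word according to the document's mixture.
        topic = rng.choices(topic_names, weights=theta)[0]
        # 3. Let the chosen topic emit a word according to its word distribution.
        dist = TOPICS[topic]
        word = rng.choices(list(dist), weights=list(dist.values()))[0]
        words.append(word)
    return theta, words


rng = random.Random(0)  # fixed seed so repeated runs give the same document
theta, words = generate_document(8, alpha=[0.5, 0.5], rng=rng)
print("topic mixture:", theta)
print("words:", words)
```

Fitting LDA runs this process in reverse: given only the words, it estimates the topic mixtures and the per-topic word distributions.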

1175 questions

-1

votes

1 answer

Combine Word Embeddings with topic-word distribution from LDA for text summarization

I'm a newbie in NLP and I was wondering if it is a good idea to summarize a document that has already been classified into a certain topic through methods such as LDA by considering the word embedding retrieved from Word2Vec and the topic-word…

asked Mar 11 '19 at 17:20

user834591230

-1

votes

1 answer

LDA: Assign more than one topic to a document

I'm new to LDA and doing some experiments with Python + LDA and some sample datasets. I already got some very interesting results and now I asked myself a question but couldn't find an answer so far. Since I worked with customer reviews/ratings of a…

python nlp data-science lda topic-modeling

asked Dec 11 '18 at 17:56

Nicson

-1

votes

3 answers

Return None in function: TypeError: object of type 'NoneType' has no len()

I am trying to print my topics and texts from each topic in LDA. But a None after printing the topics is disrupting my script. I can print my topics but not the texts. import pandas import numpy as np from sklearn.feature_extraction.text import…

python lda nonetype

asked Aug 31 '18 at 10:21

marin

-1

votes

1 answer

Python - IndexError: list index out of range (topic modeling)

I've come across a lot of similar questions. However, the answers provided did not seem helpful to me. I'm trying to run a topic modeling analysis on some 8000 media articles. But I'm getting this error: Traceback (most recent call last): …

python lda topic-modeling

asked Jul 12 '17 at 08:45

M. M. Van Hulle

-1

votes

1 answer

Classification LDA vs. TFIDF

I was running multi-label classification on text data and noticed that TFIDF outperformed LDA by a large margin. TFIDF accuracy was around 50% and LDA was around 29%. Is this expected or should LDA do better than this?

machine-learning gensim lda text-classification

asked Dec 06 '16 at 01:52

MikeAlbert

-1

votes

2 answers

LDA python library not taking sparse matrix as input

I am trying to use the lda 1.0.2 package for Python. The documentation says that sparse matrices are acceptable, but when I pass a sparse matrix to the transform() function, it throws the error: The truth value of an array with more than one element…

python sparse-matrix lda

asked Jul 26 '15 at 21:37

Ishita Agrawal

-1

votes

1 answer

How can I perform LDA (latent Dirichlet allocation) on Noun Phrases in R instead of words?

I want to generate topics from my text at the level of phrases, rather than at the level of words using LDA (latent Dirichlet allocation). How can I do that in R? LDA interprets the documents as bag-of-words and produces topics with constituting…

r lda topic-modeling

asked Jun 22 '15 at 14:20

carora3

-1

votes

1 answer

How to plot log.likelihoods for each iteration in R using LDA package?

My problem is that I want to plot the log.likelihoods gathered from LDA execution in R using the LDA package. My code is: K <- 10 ## Num clusters result <- lda.collapsed.gibbs.sampler(cora.documents, K, ## Num…

r lda topic-modeling

asked May 20 '15 at 18:53

Sidahmed Mokeddem

-2

votes

1 answer

How to remove error too many values to unpack (expected 2)

I applied an LDA model using TFIDF and now I want to evaluate its performance by classifying a sample document using the LDA TF-IDF model. Code: for index, score in sorted(lda_model_tfidf[corpus], key=lambda tup: -1*tup[1]): print("\nScore: {}\t \nTopic:…

python lda topic-modeling

asked Jul 18 '21 at 08:35

Rajat Goyal

-2

votes

1 answer

Calculating LDA in matlab

I have written the following code: %LDA file = xlsread('LDA.xlsx'); Graph=[]; for c=1:840 for i=1:17 for j=18:34 Graph=[Graph,file(i,c),file(j,c)]; end end end lda=resubLoss(Graph) but the func resubLoss does…

matlab lda

asked Aug 27 '18 at 10:43

Yasmin

-2

votes

2 answers

ROC curve in linear discriminant analysis with R

I want to compute the ROC curve and then the AUC from the linear discriminant model. Do you know how I can do this? Here is the code: ##LDA require(MASS) library(MASS) lda.fit = lda(Negative ~., trainSparse) lda.fit plot(lda.fit) ###prediction…

r machine-learning lda roc auc

asked Jan 08 '17 at 14:33

mac gionny

-2

votes

1 answer

Different dimensions of distributions of topics

I would like to divide all documents into 10 topics, and it goes well with a converged result except for the dimensions of the distributions and covariance matrix of the topics. Why is the topic distribution a 9-dimensional vector instead of 10 and their…

r lda topic-modeling

asked Dec 15 '16 at 15:42

Jeffy

-2

votes

2 answers

IndentationError: expected an indented block when trying to reproduce LDA for a document

I am trying to obtain the LDA distribution for the first article of my collection but I am running into several errors. My collection, doc_set, is a pandas.core.series.Series. Whenever I wanted to run the simple…

python pandas lda

asked May 26 '16 at 09:59

Economist_Ayahuasca

-2

votes

2 answers

bag-of-words approach / tools / library for C++?

I have a folder that contains many .txt documents of tourism reviews. I want to use the bag-of-words approach to convert them to some kind of numeric representation for machine learning (Latent Dirichlet Allocation - LDA) in C++ to train the…

c++ machine-learning text-processing text-extraction lda

asked May 19 '15 at 14:28

Indiastradi

-2

votes

2 answers

Non-GPL Open Source Latent Dirichlet Allocation Implementation/Library in C/C++

I know some implementations (mainly from this question) but they all seem to be published under the GPL. Are there any (platform independent) implementations without the GPL restrictions?

c++ c machine-learning nlp lda

asked Jul 02 '12 at 13:21

snøreven
