Questions tagged [lda]

Latent Dirichlet Allocation (LDA) is a generative probabilistic model in which sets of observations are explained by unobserved groups, and those groups account for why some parts of the data are similar.

When the observations are words collected into documents, LDA posits that each document is a mixture of a small number of topics and that each word is generated by one of the document's topics. In other words, LDA represents documents as mixtures of topics, and each topic emits words with certain probabilities.

Latent Dirichlet Allocation should not be confused with Linear Discriminant Analysis (also abbreviated LDA), a supervised learning procedure for classifying observations into a set of categories.
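The generative story described above can be made concrete with a small simulation. The following is only a minimal sketch, assuming symmetric Dirichlet priors and fixed-length documents; the function name `generate_corpus` and all parameter names and default values are hypothetical choices for illustration, not part of any LDA library.

```python
import numpy as np

def generate_corpus(n_docs=5, doc_len=20, n_topics=3, vocab_size=10,
                    alpha=0.5, beta=0.5, seed=0):
    """Sample a toy corpus from the LDA generative process.

    alpha and beta are assumed symmetric Dirichlet concentration
    parameters (illustrative values, not recommendations).
    """
    rng = np.random.default_rng(seed)
    # One word distribution per topic: shape (n_topics, vocab_size).
    phi = rng.dirichlet(np.full(vocab_size, beta), size=n_topics)
    # One topic mixture per document: shape (n_docs, n_topics).
    theta = rng.dirichlet(np.full(n_topics, alpha), size=n_docs)
    docs = []
    for d in range(n_docs):
        # Pick a topic for every word position in document d.
        z = rng.choice(n_topics, size=doc_len, p=theta[d])
        # Draw each word (as a vocabulary index) from its assigned topic.
        words = np.array([rng.choice(vocab_size, p=phi[k]) for k in z])
        docs.append(words)
    return theta, phi, docs

theta, phi, docs = generate_corpus()
print(theta.round(2))  # per-document topic proportions
print(docs[0])         # word indices of the first document
```

Fitting LDA runs this process in reverse: given only the words in `docs`, inference estimates the hidden `theta` and `phi`.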

1175 questions
0 votes, 1 answer

Mallet API - Get consistent results

I am new to LDA and Mallet, and I have the following question. I tried running Mallet LDA from the command line, and by setting --random-seed to a fixed value I was able to get consistent results across multiple runs of the algorithm. However, I did try…
Uno
0 votes, 1 answer

sLDA: how many values may the response variable have?

I am trying to understand in general how sLDA works. In contrast to LDA, it has 'a response variable associated with each document'. Is each document in the training set labeled with just one topic, or may it be labeled with multiple topics? If it must use just…
mariaza
0 votes, 1 answer

Proper Mahout CVB max iteration count

When using Mahout CVB from the command line, what is the best way to determine the iteration count? -x is the argument that sets it. The default appears to be 4 (from other readings), and the more iterations set, the more accurate…
Chris S
0 votes, 2 answers

How could we know the Dirichlet distribution is describing the topic rather than something else?

The Dirichlet distribution is used in document modelling. I read in this article that different Dirichlet distributions can be used to model documents by different authors or documents on different topics. So how could we tell whether it is…
smwikipedia
0 votes, 0 answers

Topic Modelling using RPy2

I wish to use LDA in Python using RPy. I have already tried this with the gensim package, but I still want to try RPy2 out. In R, I use this code: library(RTextTools) library(topicmodels) library(tm) ...Get Data Here and Store to…
Animesh Pandey
0 votes, 2 answers

Work-around to clear blank entries in a document term matrix?

I have some R code that I've used in the past to produce topic models. Everything was working fine until I updated all of my R packages in the hope of fixing a slightly unrelated problem. Now, code which had previously worked seems to be broken…
beniam
0 votes, 1 answer

R topic modeling - lda command 'lexicalize' giving unexpected results

I am using the 'lda' package in R to perform a topic model analysis of a corpus (let's call it 'corpusB'). I am preparing the corpus for the analysis by first using the command 'lexicalize', which returns a term-document matrix and, if not…
0 votes, 1 answer

The meaning/implication of the matrices generated by Singular Value Decomposition (SVD) for Latent Semantic Analysis (LSA)

SVD is used in LSA to get the latent semantic information. I am confused about the interpretation of the SVD matrices. We first build a document-term matrix and then use SVD to decompose it into three matrices. For example: The doc-term matrix M1 is…
smwikipedia
0 votes, 1 answer

Any LDA code example in MATLAB?

I would like to perform simple LDA on my small data set (65x8). I have 65 instances (samples), 8 features (attributes) and 4 classes. Is there any MATLAB code for LDA? As far as I know, the MATLAB toolbox does not have an LDA function, so I need to write my own code. Any…
bob
0 votes, 2 answers

Fisher's Classification Function Coefficients for multiple classes in LDA in R

I have a small question about LDA in R. When I try to get Fisher's classification function coefficients for linear discriminant analysis, as in SPSS, using the MASS package in R, I get only the coefficients of the linear discriminants, like the…
0 votes, 1 answer

LDA - Recognition Pattern in Python (sklearn)

I am trying to run this code in Python. It uses LDA from sklearn: import numpy as np from sklearn.lda import LDA X = np.array ([0.000000, 0.000000, 0.000000, 0.000000, 0.001550, 0.000000, 0.000000, 0.000000,…
gPxl
0 votes, 0 answers

Read .lda file in Java?

I need to be able to read instructions from a .lda file (laser data file) for a project I am working on. It is a binary file, so with the right editor I can see the contents in hex, but getting Java to read it is proving to be…
Nodal
0 votes, 1 answer

Matlab - LDA "The pooled covariance matrix of TRAINING must be positive definite."

Can someone help me out with this problem? I have been trying to figure this out for a long time. I have a training_Set: <1530*270400 double> and a Test_Set: <4794*270400 double>. I am using the linear discriminant analysis method class =…
Om Choudhary
0 votes, 1 answer

How to run lda using the jar files in mahout-distribution-0.7

I have several jar files, namely mahout-integration-0.7.jar, mahout-math-0.7.jar, mahout-core-0.7.jar, mahout-core-0.7-job.jar, mahout-examples-0.7.jar and mahout-examples-0.7-job.jar. How do I run LDA by calling a certain jar file, such as what…
Kenneth Yang
0 votes, 1 answer

About the inference result of Blei's lda-c-dist

I have a question about the inference result of the lda-c-dist package. How many words should be displayed when viewing the results of inference? For example, if I set the number of words to a very large number N (assuming the total number of terms is N), it seems to…
Peiyun