Highest Voted 'lsa' Questions

3

votes

1 answer

How to handle negative values of cosine similarities

I computed the tf-idf of my documents based on terms. Then I applied LSA to reduce the dimensionality of the terms. 'similarity_dist' contains negative values (see table below). How can I compute cosine distance with the range…

asked May 26 '16 at 07:53

kitchenprinzessin

1,023
3
14
30
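
For context on the question above: LSA coordinates can be negative even when the tf-idf input is non-negative, so cosine similarity between reduced vectors falls anywhere in [-1, 1]. The excerpt is cut off before it states the range the asker wants, so the following is only a minimal sketch under the assumption that a non-negative distance is the goal. The function names and the two sample vectors are hypothetical and are not taken from the question.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between u and v; lies in [-1, 1]."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    # Clamp to absorb floating-point rounding just outside [-1, 1].
    return max(-1.0, min(1.0, dot / (norm_u * norm_v)))


def cosine_distance(u, v):
    """1 - similarity; lies in [0, 2] once similarities can be negative."""
    return 1.0 - cosine_similarity(u, v)


def angular_distance(u, v):
    """Angle between u and v divided by pi; lies in [0, 1]."""
    return math.acos(cosine_similarity(u, v)) / math.pi


# Hypothetical 2-dimensional LSA vectors with mixed signs (illustration only).
doc_a = [0.8, -0.3]
doc_b = [-0.5, 0.6]

sim = cosine_similarity(doc_a, doc_b)
print(sim, cosine_distance(doc_a, doc_b), angular_distance(doc_a, doc_b))
```

Whether to use 1 - similarity or the angular form depends on the range required; the angular form is the one of the two that stays within [0, 1].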

3

votes

1 answer

How to compute word similarity using TF-IDF or LSA with gensim?

I know that word2vec in gensim can compute similarity between words. But now I want to compute word similarity using TF-IDF or LSA with gensim. How can I do it? Note: Computing document similarity using LSA with gensim is easy:…

python nlp tf-idf gensim lsa

asked Mar 14 '16 at 06:49

hankaixyz

96
1
5

3

votes

1 answer

Calling AuditQuerySystemPolicy() (advapi32.dll) from C# returns "The parameter is incorrect"

The sequence is as follows: open a policy handle with LsaOpenPolicy() (not shown); call LsaQueryInformationPolicy() to get the number of categories; for each category, call AuditLookupCategoryGuidFromCategoryId() to turn the enum value into a…

c# marshalling unsafe advapi32 lsa

asked Jun 10 '10 at 17:17

JCCyC

16,140
11
48
75

3

votes

2 answers

Singular Value Decomposition: Different results with Jama, PColt and NumPy

I want to perform Singular Value Decomposition on a large (sparse) matrix. In order to choose the best (most accurate) library, I tried replicating the SVD example provided here using different Java and Python libraries. Strangely, I am getting…

numpy svd lsa jama colt

asked Jul 16 '13 at 16:57

user2588219

31
1
2

3

votes

1 answer

Applying a function between specific pairs of columns in a matrix in R

I am generating a matrix using the lsa package in R. After the matrix is created, I would like to calculate the cosine similarity between specific pairs of documents (columns) in the matrix. Currently, I am doing this with nested for-loops, and it…

r matrix apply lsa

asked Apr 21 '13 at 12:28

E. Moritz

51
1
6

3

votes

1 answer

pLSA implementation for sparse matrix

I'm trying to implement the pLSA algorithm proposed by Thomas Hofmann (1999). However, all the implementations I have found treat the input term-doc matrix as dense instead of sparse. Since my input matrix is quite large and sparse, I would…

sparse-matrix lsa topic-modeling

asked Sep 11 '12 at 20:01

Jia

1,301
1
12
18

3

votes

1 answer

Latent Semantic Analysis/Indexing Library for C++

Is there a C++ library for LSA/LSI? Preferably MIT, BSD, Apache,... license - no GPL.

c++ nlp lsa

asked May 21 '12 at 12:31

snøreven

1,904
2
19
39

2

votes

0 answers

HKLM\Security vs. Security\Policy

I am researching how an attacker would obtain a machine's credentials. I figured the most common method is to dump hklm\sam, hklm\security, and hklm\system. I was able to figure out what information is stored in the SAM and why I would want to save it,…

windows internals sysinternals lsa

asked Oct 17 '19 at 11:16

Knightwish

51
1
4

2

votes

0 answers

LSA and K means in document clustering, results are not printing correctly

I have recently done some document clustering using LSA and then K-means. However, when I try to print the most important words in each cluster, I'm getting very strange results: it prints words that don't even belong to that cluster. Below is the code and…

cluster-analysis k-means lsa

asked Jul 07 '18 at 09:11

Brian Ly

21
1

2

votes

1 answer

How to get the vector representation of a word using a trained SVD model

I have trained (fit and transform) an SVD model using 400 documents as part of my effort to build an LSA model. Here is my code: tfidf_vectorizer = sklearn.feature_extraction.text.TfidfVectorizer(stop_words='english', use_idf=True,…

python scikit-learn svd lsa

asked Jun 18 '18 at 18:12

Pedram

2,421
4
31
49

2

votes

1 answer

Adding documents to gensim model

I have a class wrapping the various objects required for calculating LSI similarity: class SimilarityFiles: def __init__(self, file_name, tokenized_corpus, stoplist=None): if stoplist is None: self.filtered_corpus =…

python-3.x gensim lsa

asked Aug 15 '17 at 16:12

faerubin

177
12

2

votes

2 answers

Optimal Document Size for LSI Similarity Model

I'm using Gensim's excellent library to compute similarity queries on a corpus using LSI. However, I have a distinct feeling that the results could be better, and I'm trying to figure out whether I can adjust the corpus itself in order to improve…

gensim lsa

asked Aug 08 '17 at 16:08

faerubin

177
12

2

votes

1 answer

Implementing LSA for elasticsearch index

I've just spent the last couple of days wrapping my head around implementing Latent Semantic Analysis for documents that are indexed in Elasticsearch. The first step is to build the term-document matrix. So I am thinking of using the Stanford NLP library that take…

java stanford-nlp elasticsearch-plugin elasticsearch-5 lsa

asked May 19 '17 at 10:55

Sara

57
1
2
11

2

votes

0 answers

How do I run LSA/SVD on a Spark DataFrame in a Pipeline?

I would like to be able to use the Pipeline functionality of Spark 2.0+ for building my models, but I cannot figure out how to incorporate LSA/SVD in my Pipeline. I am aware of the functionality on RDDs, but I do not believe that can be…

java apache-spark machine-learning svd lsa

asked Apr 13 '17 at 22:26

Thomas Hughes

53
6

2

votes

0 answers

How to print out the documents in each cluster generated by LDA?

The print_top_words method from the code below only prints the distribution of the words for each topic: Cluster 1: word1, word2, .... Cluster 2: word3, word2, .... So, instead of printing out the word distribution, I would like to print the…

python scikit-learn lda topic-modeling lsa

asked Jan 19 '17 at 07:28

Abood Al-mars

31
5
