Questions tagged [lsa]

LSA stands for Latent Semantic Analysis, a natural language processing technique that analyses the relationships between a set of documents and the terms they contain by producing a set of concepts related to both.

For the Microsoft Windows subsystem, see [local-security-authority].
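
As a rough illustration of the technique described above, the following Python sketch builds a small term-document matrix and truncates its singular value decomposition (SVD) to two concepts. The toy documents, the helper names `term_document_matrix`, `lsa`, and `cosine`, the use of NumPy, and the choice of k = 2 are all assumptions made for illustration; none of them come from a question on this page.

```python
import numpy as np

# Toy corpus (an assumption for illustration only).
docs = [
    "cat sat on the mat",
    "dog sat on the log",
    "stocks fell on the market",
    "market stocks rose",
]


def term_document_matrix(documents):
    """Return (vocabulary, matrix) with one row per term and one column per document."""
    vocab = sorted({word for doc in documents for word in doc.split()})
    index = {word: i for i, word in enumerate(vocab)}
    matrix = np.zeros((len(vocab), len(documents)))
    for j, doc in enumerate(documents):
        for word in doc.split():
            matrix[index[word], j] += 1.0  # raw term counts, no weighting
    return vocab, matrix


def lsa(matrix, k):
    """Keep the k largest singular values and their singular vectors."""
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    return u[:, :k], s[:k], vt[:k, :]


def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


vocab, A = term_document_matrix(docs)
U_k, s_k, Vt_k = lsa(A, k=2)

# One row per document, expressed in the k-dimensional concept space.
doc_vectors = (s_k[:, None] * Vt_k).T

print("doc 0 vs doc 1:", cosine(doc_vectors[0], doc_vectors[1]))
print("doc 0 vs doc 3:", cosine(doc_vectors[0], doc_vectors[3]))
```

In this toy corpus the first two documents point in nearly the same direction in concept space, while the first and last, which share no words, do not. Real uses typically apply a weighting such as TF-IDF before the SVD and choose k empirically.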

126 questions
2
votes
0 answers

override similarity with Lucene and use LSA+SVD instead

I'm working on an existing project that uses Lucene for searching and returning matches. It doesn't use any custom analyzer or external algorithm. The documents are tiny, with rows of no more than 50 words, so I know LSA and SVD will work better…
Lelo
  • 347
  • 3
  • 16
2
votes
0 answers

R: How to perform lsa() with parallel processing format

I am trying to do some text analytics on tweets and to use lsa() for dimensionality reduction (DR). However, calculating the LSA space seems to be EXTREMELY memory intensive. I can only process up to 2.3k tweets before my computer dies. As I researched online…
alwaysaskingquestions
  • 1,595
  • 5
  • 22
  • 49
2
votes
1 answer

Using the lsa package in R - Error in Ops.simple_triplet_matrix(m, 1) : Incompatible dimensions

I am trying to learn to use the lsa package in R. I am working with a much larger data set than the example below, but this is for the purposes of reproducibility (props to this person for posting this code on his site, it's a great resource). I…
2
votes
2 answers

Compute cosine similarities between documents in semantic space, using R-lsa package

I'm trying to cluster similar documents using the R language. As a first step, I compute the term-document matrix for my set of documents. Then I create the latent semantic space for the term-document matrix previously created. I decided to use…
lucasbls1
  • 83
  • 2
  • 5
2
votes
2 answers

Why is the OSPF LSA sequence number in the range 0x80000001 to 0x7FFFFFFF?

Why is the OSPF LSA sequence number in the range 0x80000001 to 0x7FFFFFFF? I suppose it is for historical reasons, but I cannot find an answer on Google.
Jan Pluskal
  • 103
  • 1
  • 5
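
One way to read the range in the question above, assuming the sequence number is a signed 32-bit integer as the OSPFv2 specification (RFC 2328) describes it: 0x80000001 is the most negative value in use and 0x7FFFFFFF is the largest positive one, so the range runs upward once the bits are interpreted as two's complement. The helper name `as_signed32` below is hypothetical, and the sketch only demonstrates the reinterpretation, not any router behaviour.

```python
import struct

# Constants as named in RFC 2328; the values are those quoted in the question.
INITIAL_SEQUENCE_NUMBER = 0x80000001
MAX_SEQUENCE_NUMBER = 0x7FFFFFFF
RESERVED_SEQUENCE_NUMBER = 0x80000000  # lowest signed value, left unused


def as_signed32(value):
    """Reinterpret an unsigned 32-bit value as a two's-complement signed integer."""
    return struct.unpack(">i", struct.pack(">I", value))[0]


print(as_signed32(INITIAL_SEQUENCE_NUMBER))  # -2147483647
print(as_signed32(MAX_SEQUENCE_NUMBER))      # 2147483647

# Viewed as signed integers, the initial value is smaller than the maximum.
print(as_signed32(INITIAL_SEQUENCE_NUMBER) < as_signed32(MAX_SEQUENCE_NUMBER))  # True
```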
1
vote
0 answers

SSPI/LSA Authentication

The idea is to get company domain credentials. I am trying to authenticate users who are using RAS cards and a VPN to join a network. I found some code that does something similar, but for some reason it returns only local credentials. But I am…
Wild Goat
  • 3,509
  • 12
  • 46
  • 87
1
vote
1 answer

What is the role of latent semantic analysis in developing search engines?

I am trying to develop a music-focused search engine for my final year project. I have been doing some research on Latent Semantic Analysis and how it works on the Internet. I am having trouble understanding where LSI sits exactly in the whole system…
Deepankar Bajpeyi
  • 5,661
  • 11
  • 44
  • 64
1
vote
1 answer

Problems using Jama in Java for LSA

I am making use of the Jama package to perform LSA. I was told to reduce the dimensionality, so I have reduced it to 3 in this case and reconstructed the matrix. But the resultant matrix is very different from the one I had given to…
CTsiddharth
  • 907
  • 12
  • 21
1
vote
0 answers

Buffer has wrong number of dimensions (expected 1, got 2). How to fit the dimensions problem?

import umap.umap_ as umap
# Uniform Manifold Approximation and Projection, find out how distinct our topics are
# https://umap-learn.readthedocs.io/en/latest/
embedding = umap.UMAP(n_neighbors=150, min_dist=0.5, random_state=12).fit_transform(X_topics)…
1
vote
1 answer

Google Local Service API - 500 Internal Server Error

Since last Friday, I have been getting a 500 Internal Server Error while using the Google Local Services API through the OAuth 2.0 method, as per the documentation: https://services.google.com/fh/files/helpcenter/lsa_api_dev_guide.pdf I am sending the following…
1
vote
0 answers

Will LSA work well on a corpus of documents of significantly different sizes?

I have to assess pairwise similarities of documents of different sizes (from 300 words to more than 200k words). To do so, I have created a procedure that uses the LSA algorithm as implemented in gensim. It includes these steps: document…
labelled
  • 11
  • 2
1
vote
2 answers

Pooling Method in TREC competitions

This is a very fundamental and silly doubt. I have read that in order to prevent large relevance assessments in TREC competitions (reference), the top-ranked documents returned by participating systems are pooled to create the set of documents for…
1
vote
2 answers

How do I convert this print statement into a data frame? Python NLP LSA topics

I need to add these LSA topics to each corresponding topic in my data frame. How can I get this print statement output in a data frame? --> I am trying to get a data frame with the topic numbers and their corresponding keywords in a different…
1
vote
1 answer

In count vectorizer which axis to use?

I want to create a document term matrix. In my case it is not like documents x words but it is sentences x words so the sentences will act as the documents. I am using 'l2' normalization post doc-term matrix creation. The term count is important for…
1
vote
2 answers

Factors Analysis using MDP in Python

Excuse my ignorance, I'm very new to Python. I'm trying to perform factor analysis in Python using MDP (though I can use another library if there's a better solution). I have an m by n matrix (called matrix) and I tried to do: import…
Jeff
  • 12,147
  • 10
  • 51
  • 87