Questions tagged [lsa]

LSA stands for Latent Semantic Analysis, a natural language processing technique which involves analysing the relationships between documents and terms they contain by producing a set of related concepts.
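
As a rough illustration of the description above, the following sketch builds a tiny term-document count matrix and reduces it to two "concepts" with a truncated SVD. The corpus, the variable names, and the choice of `k = 2` are all assumptions made for this example. It uses raw counts with no TF-IDF weighting or preprocessing, so treat it as a minimal sketch rather than a reference implementation.

```python
import numpy as np

# Toy corpus (hypothetical): two documents about pets, two about finance.
docs = [
    "cat dog pet",
    "dog pet animal",
    "stock market trade",
    "market trade bank",
]

# Term-document count matrix: rows are terms, columns are documents.
vocab = sorted({word for doc in docs for word in doc.split()})
counts = np.array(
    [[doc.split().count(term) for doc in docs] for term in vocab],
    dtype=float,
)

# Truncated SVD: keep only the k largest singular values ("concepts").
k = 2  # assumed number of concepts for this toy example
U, s, Vt = np.linalg.svd(counts, full_matrices=False)
doc_vectors = (np.diag(s[:k]) @ Vt[:k]).T  # one k-dimensional row per document


def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


same_topic = cosine(doc_vectors[0], doc_vectors[1])   # pets vs. pets
cross_topic = cosine(doc_vectors[0], doc_vectors[2])  # pets vs. finance
print(f"same topic: {same_topic:.3f}, cross topic: {cross_topic:.3f}")
```

In the reduced space, documents that share vocabulary end up close together, while documents with no terms in common stay nearly orthogonal.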

For the Microsoft Windows subsystem, see (local-security-authority).

126 questions
1
vote
0 answers

Semantic Comparison between Sentences

I want to do a semantic comparison of sentences. For example, I have an input: "Trump has never been the president of the United States". I scrape newspaper sites for this and find, let's suppose, this result: "Trump is the president of…
oOXAam
  • 237
  • 1
  • 6
  • 20
1
vote
2 answers

Why is LSA in text2vec producing different results every time?

I was using latent semantic analysis in the text2vec package to generate word vectors, and using transform to fit new data, when I noticed something odd: the spaces do not line up when trained on the same data. There appears to be some…
user3554004
  • 1,044
  • 9
  • 24
1
vote
1 answer

Latent Semantic Analysis results

I'm following a tutorial for LSA and, having switched the example to a different list of strings, I'm not sure the code is working as expected. When I use the example input as given in the tutorial, it produces sensible answers. However, when I use…
Hews
  • 569
  • 6
  • 19
1
vote
0 answers

Explained variance in TruncatedSVD

As I tried to get my head around LSA, I discovered that I am not able to reproduce the result from TruncatedSVD using SVD. Why does this not work? Thank you for your help. import pandas as pd import numpy as np from sklearn.decomposition import…
OB1
  • 149
  • 2
  • 7
1
vote
0 answers

Latent Semantic Analysis: How to choose component number to perform TruncatedSVD

I am practicing using LSA to classify the Enron dataset (all emails). My understanding is that, to successfully perform any further classification or clustering, I need to perform a lower-rank approximation using TruncatedSVD to maximize the variance. I have…
John Li
  • 43
  • 6
1
vote
0 answers

Calculate conceptual and relation similarity of two words in Java

I am implementing a readability formula in Java based on this paper. I reached the point where I have to compute the conceptual and the relational similarity of two or more words. They say: We use Latent Semantic Analysis (LSA) tools to compute…
João Alves
  • 185
  • 1
  • 5
  • 14
1
vote
1 answer

R LSA LSAfun encoding problems

I would like to use the genericSummary function from the package LSAfun. Here's a German sample text. library("LSAfun") text = " Gegen die Firma wurde während der letzten Woche ein Zwangsvollstreckungsverfahren eingeleitet. Darüber witzeln die…
WinterMensch
  • 643
  • 1
  • 7
  • 17
1
vote
0 answers

Performing SVD Feature Decomposition on a Large Sparse Matrix

I saved my features from text data with pickle in sparse matrix format with a shape of (323549, 4119259). I am trying to perform Singular Value Decomposition on them using the sklearn library, however, I keep getting a memory error which suggests…
zzenonn
  • 58
  • 9
1
vote
0 answers

Latent text analysis (lsa package) using whole documents in R

I have code that successfully performs Latent Text Analysis on short citations using the lsa package in R (see below). However, I would rather use this method on text from larger documents. Copy-pasting the whole thing in each citation…
Naomi Peer
  • 367
  • 2
  • 10
1
vote
1 answer

Document similarity using LSA in R

I am working on LSA (using R) for document similarity analysis. Here are my steps: imported the text data and created a corpus; did basic corpus operations like stemming, whitespace removal, etc.; created the LSA space as below: tdm <-…
Sreenath1986
  • 167
  • 4
  • 16
1
vote
0 answers

Semantic search in ElasticSearch

What is the best way to add semantics in ES? I have read this: Semantic search with NLP and elasticsearch, but there are a lot of manual steps there, and on top of that it is quite old. For example: knowing the list of topics and what topic a document belongs…
Praveen
  • 338
  • 2
  • 11
1
vote
1 answer

How to cluster documents under topics using latent semantic analysis (lsa)

I've been working on latent semantic analysis (lsa) and applied this example: https://radimrehurek.com/gensim/tut2.html It covers clustering terms under topics, but I couldn't find anything on how we can cluster documents under topics. In that…
user3288051
  • 574
  • 1
  • 11
  • 28
1
vote
1 answer

R: how to map test data into lsa space created by training data

I am trying to do text analysis using LSA. I've read many other posts regarding LSA on StackOverflow, but I have not found one similar to mine yet. If you know of one similar to mine, please kindly redirect me to it! Much appreciated! Here's my…
alwaysaskingquestions
  • 1,595
  • 5
  • 22
  • 49
1
vote
1 answer

Text clustering application meaning

On the scikit-learn site there is an example of k-means applied to text mining. The excerpt of interest is below: if opts.n_components: original_space_centroids = svd.inverse_transform(km.cluster_centers_) order_centroids =…
1
vote
0 answers

Does Gensim handle multi-word terms when processing Wikipedia corpus?

I was reading the Experiments on the English Wikipedia tutorial and noticed that many of the topics generated by LSA and LDA contained multi-word terms that had clearly been concatenated, e.g. northamerica, hockeyarchives. Could someone indicate where…
DTailor
  • 11
  • 3