Questions tagged [lsa]

LSA stands for Latent Semantic Analysis, a natural language processing technique that analyses the relationships between a set of documents and the terms they contain by producing a set of concepts related to both.
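As a rough illustration of that description, here is a minimal sketch that builds a term-document count matrix and reduces it with a truncated SVD, assuming NumPy is available. The toy corpus, the choice of `k = 2`, and the helper name `cosine` are hypothetical and not taken from any question on this page; real pipelines usually add weighting such as tf-idf and use a much larger corpus.

```python
import numpy as np

# Toy corpus (hypothetical, for illustration only).
docs = [
    "cat sat mat",
    "cat chased mouse",
    "stocks fell market",
    "market rallied stocks",
]

# Term-document count matrix: rows are terms, columns are documents.
vocab = sorted({w for d in docs for w in d.split()})
index = {w: i for i, w in enumerate(vocab)}
A = np.zeros((len(vocab), len(docs)))
for j, d in enumerate(docs):
    for w in d.split():
        A[index[w], j] += 1

# Truncated SVD: keep only the k largest singular values ("concepts").
k = 2  # assumed number of latent dimensions
U, s, Vt = np.linalg.svd(A, full_matrices=False)
doc_vectors = (np.diag(s[:k]) @ Vt[:k, :]).T  # one k-dimensional row per document

def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

sim_same = cosine(doc_vectors[0], doc_vectors[1])  # two "cat" documents
sim_diff = cosine(doc_vectors[0], doc_vectors[2])  # "cat" vs. "stocks" document
```

In this toy setup the two cat documents end up with a cosine similarity close to 1 in the reduced space, while the cat and stock documents come out close to 0.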

For the Microsoft Windows subsystem, see [local-security-authority].

126 questions
1
vote
1 answer

R - reduce dimensionality LSA

I am following an example of SVD, but I still don't know how to reduce the dimension of the final matrix: a <- round(runif(10)*100) dat <- as.matrix(iris[a,-5]) rownames(dat) <- c(1:10) s <- svd(dat) pc.use <- 1 recon <- s$u[,pc.use] %*%…
GabyLP
  • 3,649
  • 7
  • 45
  • 66
1
vote
0 answers

Create hierarchical relations between a set of terms

I need to form hierarchical relations between a set of terms (which may be entities, nouns, etc.) by mining the web. This is along the lines of a taxonomy; however, I need to be able to link proper nouns (people) and entities in a meaningful manner. E.g.…
midi
  • 460
  • 3
  • 17
1
vote
1 answer

Add user to local security policy on Windows Server 2012

When using the code from LSA Functions Privileges and Impersonation on Windows Server 2008R2 it works fine to add a user to the logon as a service policy. When using this code on Windows Server 2012 it doesn't work. The function…
1
vote
0 answers

NLTK CorpusTerm by Document matrix

I am going to use CountVectorizer with a large corpus which I retrieve from Gutenberg (or any dataset from nltk). There are ebooks in this corpus. I want to gather all sentences in those books in the same list. Something like…
Denis
  • 151
  • 1
  • 4
  • 11
1
vote
1 answer

Windows Password Filter DLL not loading

I am attempting to implement a very basic Windows password filter in C++ based on the examples in this DevX article. However, LSA is not loading the DLL (nothing in the loaded modules in msinfo32), despite the appropriate registry entry being set and the…
Dan J
  • 11
  • 4
1
vote
1 answer

How many singular values to keep in the R package lsa

I used the function lsa in the R package lsa to get the semantic space. The input is a term-document matrix. The problem is that the dimcalc_share() function used by lsa by default seems to be wrong. The help page of the function says the function…
de_cluster
  • 13
  • 3
1
vote
0 answers

Transforming CountVectorizer with entropy (log-entropy) / sklearn

I would like to try out some variations around Latent Semantic Analysis (LSA) with scikit-learn. Besides pure frequency counts from CountVectorizer() and the weighted result of TfidfTransformer(), I'd like to test weighting by entropy (and…
emiguevara
  • 1,359
  • 13
  • 26
1
vote
2 answers

How is the similarity between sentences calculated with LSA?

I have understood how LSA works when the similarity between words is calculated. I am using LSA from the website lsa.colorado.edu, but I cannot find a source explaining how the similarity between sentences or multiple words is calculated. Is it just done by…
kumquatz
  • 11
  • 4
1
vote
1 answer

How to avoid error in textmatrix function in R's LSA package

I'm taking part in this Kaggle competition and I'm wondering if anyone has any familiarity with the textmatrix function from the LSA package in R. Basically, the textmatrix function accepts a directory as an argument and it will create a textmatrix…
user141146
  • 3,285
  • 7
  • 38
  • 54
0
votes
1 answer

Problems with SVD in Java

I have gone through JAMA and Colt (I code in Java). Both of them expect arrays in which the number of rows is greater than the number of columns. But in the case of latent semantic analysis (LSA), I have 5 books and there are a total of…
CTsiddharth
  • 907
  • 12
  • 21
0
votes
1 answer

Doubts regarding LSA

I have to find the similarity between a reference document and the set of documents in a repository. Method: 1. I find the term-document matrix for all the documents, including the reference document. 2. The SVD is calculated for this matrix. 3.…
CTsiddharth
  • 907
  • 12
  • 21
0
votes
0 answers

How to create a Document Term Matrix in R (using LSA)?

I'm trying to build a document matrix using the LSA package for my research in R. The txt file I'm trying to read contains text from 10,000 tweets, and there is data in there. But loading TDM results in the error below. I'm using this package…
Alex
  • 1
  • 1
0
votes
0 answers

Semantic similarity function in R

I have the following function to measure the semantic similarity of the abstracts of two papers: cosine_similarity <- function(abstract1, abstract2) { # Create a document-term matrix from the two abstracts docs <- c(abstract1,…
Reza
  • 15
  • 6
0
votes
0 answers

Is it possible to use Latent Semantic Analysis on the fly?

I'm doing research on different methods of word embedding. As far as I understand, Latent Semantic Analysis is a way of reducing the dimensions of a huge matrix built by counting words in documents (possibly normalized with things like tf-idf but…
0
votes
0 answers

Is LSA + cosine similarity the right approach to get the similarity between 2 documents?

I am working on a Moodle plugin to autograde student essay answers with cosine similarity. The way it works is just to compare the key answer and the student answer with cosine similarity, along with pre-processing such as text normalization,…
newtocoding
  • 97
  • 2
  • 5