
SVD is used in LSA to get the latent semantic information. I am confused about the interpretation of the SVD matrices.

We first build a document-term matrix, and then use SVD to decompose it into three matrices.

For example:

The doc-term matrix M1 is M x N, where:

M = the number of documents
N = the number of terms

M1 is then decomposed into:

M1 = M2 * M3 * M4, where:

M2: M x k

M3: k x k

M4: k x N
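For concreteness, here is a minimal sketch of this decomposition (assuming Python with NumPy; the toy matrix values and the choice k = 2 are made up):

```python
import numpy as np

M, N, k = 4, 5, 2          # 4 documents, 5 terms, keep 2 latent dimensions

# Toy document-term matrix (e.g. raw term counts).
M1 = np.array([[2, 1, 0, 0, 1],
               [1, 2, 0, 0, 0],
               [0, 0, 3, 1, 0],
               [0, 1, 2, 2, 0]], dtype=float)

U, s, Vt = np.linalg.svd(M1, full_matrices=False)

# Truncate to the k largest singular values.
M2 = U[:, :k]              # M x k
M3 = np.diag(s[:k])        # k x k, diagonal matrix of singular values
M4 = Vt[:k, :]             # k x N

print(M2.shape, M3.shape, M4.shape)  # (4, 2) (2, 2) (2, 5)
print(np.round(M2 @ M3 @ M4, 2))     # rank-k approximation of M1
```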

I have seen the interpretation below:

The k columns of M2 stand for categories of similar semantics. The k rows of M4 stand for the topics.

My questions are:

  1. Why is k interpreted as above? How do we know it captures similar semantics and topics?

  2. Why do the similar semantics equal the topics?

  3. Why is k interpreted differently between M2 and M4?

  4. How should M3 be interpreted?

I am really confused. The interpretation seems totally arbitrary. Is that what "latent" is meant to be?
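To make the interpretation being asked about concrete, here is a follow-up sketch (same made-up data and names as the sketch above; the term strings are invented for illustration). Each row of M4 assigns a weight to every term, so the heaviest terms in a row hint at what that latent dimension is "about"; each row of M2 (scaled by M3) locates a document along the same k dimensions:

```python
import numpy as np

terms = ["cat", "dog", "stock", "market", "pet"]
M1 = np.array([[2, 1, 0, 0, 1],
               [1, 2, 0, 0, 0],
               [0, 0, 3, 1, 0],
               [0, 1, 2, 2, 0]], dtype=float)

k = 2
U, s, Vt = np.linalg.svd(M1, full_matrices=False)
M2, M3, M4 = U[:, :k], np.diag(s[:k]), Vt[:k, :]

# Heaviest terms per latent dimension (row of M4).
for i, row in enumerate(M4):
    top = np.argsort(-np.abs(row))[:3]
    print(f"dimension {i}:", [terms[j] for j in top])

# Documents as points in the k-dimensional latent space.
print(np.round(M2 @ M3, 2))
```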

smwikipedia
  • If `SVD` seems too arbitrary, try using `PCA` instead. They're effectively equivalent (see the sketch after these comments), but `PCA` is much easier to convince yourself of and can help explain a lot of the reasoning behind `SVD` interpretation. A full explanation of `SVD` should either be on math exchange, or constitute most of a linear algebra course. – Slater Victoroff Jan 08 '14 at 07:23
  • I think the question was about why k apparently has a similar-sounding-yet-differently-named interpretation in the different matrices. Does it really? – Heather Stark Jan 08 '14 at 14:12
  • @HeatherStark yes, that should be one of my concerns, too. Thanks for pointing it out. I updated the question. – smwikipedia Jan 09 '14 at 07:29
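Regarding the `PCA` suggestion in the first comment, here is a minimal sketch of the correspondence (assuming NumPy; the data is random): `PCA` is an eigendecomposition of the covariance matrix of the mean-centered data, and `SVD` of the centered data matrix recovers the same components:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
Xc = X - X.mean(axis=0)                  # center each column

# PCA: eigendecomposition of the covariance matrix.
cov = Xc.T @ Xc / (len(Xc) - 1)
eigvals, eigvecs = np.linalg.eigh(cov)

# The same components via SVD of the centered data.
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)

# Squared singular values over (n - 1) are the eigenvalues;
# rows of Vt match columns of eigvecs up to sign and ordering.
print(np.allclose(np.sort(s**2 / (len(Xc) - 1)), np.sort(eigvals)))  # True
```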

1 Answer


I warmly recommend reading the information retrieval chapter in the SNLP bible (Foundations of Statistical Natural Language Processing by Manning and Schütze). In five pages it explains everything you want to know about LSI and SVD.

You will find paragraphs like this:

[screenshot of an excerpt from the book; image not recoverable]

jhegedus