I have an example word-by-document matrix (from Landauer and Dumais, 1997):

wxd <- matrix(c(1,1,1,0,0,0,0,0,0,0,0,0,
            0,0,1,1,1,1,1,0,1,0,0,0,
            0,1,0,1,1,0,0,1,0,0,0,0,
            1,0,0,0,2,0,0,1,0,0,0,0,
            0,0,0,1,0,1,1,0,0,0,0,0,
            0,0,0,0,0,0,0,0,0,1,0,0,
            0,0,0,0,0,0,0,0,0,1,1,0,
            0,0,0,0,0,0,0,0,0,1,1,1,
            0,0,0,0,0,0,0,0,1,0,1,1)
          ,12, 9)
rownames(wxd) <- c("human", "interface", "computer", "user", "system", 
               "response", "time", "EPS", "survey", "trees", "graph", "minors")
colnames(wxd) <- c(paste0("c", 1:5), paste0("m", 1:4))

I can perform Singular Value Decomposition on this matrix using the svd() function, which gives me three matrices U, S, and V:

SVD <- svd(wxd)
U <- SVD$u
S <- diag(SVD$d)
V <- SVD$v
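
For reference, the shapes of these three matrices for my 12 x 9 input are:

dim(U)  # 12  9
dim(S)  #  9  9
dim(V)  #  9  9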

I can multiply these matrices and get my original matrix back (within some small margin of error):

U %*% S %*% t(V)
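
One way I can confirm this numerically (ignoring the dimnames, which the reconstruction does not carry over):

all.equal(wxd, U %*% S %*% t(V), check.attributes = FALSE)  # TRUE
max(abs(wxd - U %*% S %*% t(V)))                            # roughly machine precision, ~1e-15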

I can also take the first two columns of the U and V matrices and the first two columns and rows of the S matrix to get the least-squares best approximation of the original data. This fits with the results of the same procedure in the paper I mentioned above:

U[ , 1:2] %*% S[1:2, 1:2] %*% t(V[ , 1:2])
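
To see how far this rank-2 approximation is from the original data, I can look at the Frobenius norm of the residual:

norm(wxd - U[, 1:2] %*% S[1:2, 1:2] %*% t(V[, 1:2]), type = "F")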

I want to make sure I understand what this function is doing (as best I am able), and I have been able to generate V and S matrices that match those from the svd() function:

ATA <- t(wxd) %*% wxd
V2 <- eigen(ATA)$vectors

S2 <- sqrt(diag(eigen(ATA)$values))
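
As a check, the singular values from svd() agree with the square roots of the eigenvalues of ATA (both are returned in decreasing order):

all.equal(SVD$d, sqrt(eigen(ATA)$values))  # TRUE, up to numerical precision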

But the U matrix I generate has the same absolute values as svd()'s U in its first 9 columns, and then has 3 additional columns. Also, some elements of this U matrix have different signs than the corresponding elements of the U matrix from the svd() function:

AAT <- wxd %*% t(wxd)
U2 <- eigen(AAT)$vectors
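
Comparing the shapes and the absolute values makes the difference concrete (the column-wise match in absolute value assumes the eigenvalues come back in the same order as the singular values, which they do here):

dim(U)                              # 12  9
dim(U2)                             # 12 12
max(abs(abs(U) - abs(U2[, 1:9])))   # essentially zero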

So my question is: why is the U matrix different when I attempt to calculate it from scratch?


1 Answer


wxd has a rank of 9. Therefore, your AAT only has 9 non-zero eigenvalues (the rest are very small, ~1e-16). For those zero eigenvalues, the eigenvectors are arbitrary as long as they span the subspace of R^12 orthogonal to the one spanned by the other eigenvectors.
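
A quick sanity check (the exact tiny values will vary, but the count of non-negligible eigenvalues is 9):

qr(wxd)$rank                    # 9
sum(eigen(AAT)$values > 1e-10)  # 9: only 9 eigenvalues are meaningfully non-zero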

Now, by default svd only computes nu=min(n,p) left singular vectors (and similarly for the right singular vectors), where n is the number of rows and p is the number of columns of the input (see ?svd). Therefore, you only get 9 left singular vectors. To generate all 12, call svd with:

svd(wxd,nu=nrow(wxd))
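
You can confirm that this now returns all 12 left singular vectors:

dim(svd(wxd, nu = nrow(wxd))$u)  # 12 12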

However, those extra 3 left singular vectors will not, in general, correspond to those found with eigen(AAT)$vectors, again because those eigenvectors are determined somewhat arbitrarily so as to span that orthogonal subspace.

As for why some of the signs have changed, recall that eigenvectors are only determined up to a scale factor. Although these eigenvectors are normalized, they may still differ by a factor of -1. To check, just divide each column of U by the corresponding column of U2. You should get columns of all 1s or all -1s:

U[,1:9]/U2[,1:9]
##      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
## [1,]    1   -1    1   -1    1   -1    1    1    1
## [2,]    1   -1    1   -1    1   -1    1    1    1
## [3,]    1   -1    1   -1    1   -1    1    1    1
## [4,]    1   -1    1   -1    1   -1    1    1    1
## [5,]    1   -1    1   -1    1   -1    1    1    1
## [6,]    1   -1    1   -1    1   -1    1    1    1
## [7,]    1   -1    1   -1    1   -1    1    1    1
## [8,]    1   -1    1   -1    1   -1    1    1    1
## [9,]    1   -1    1   -1    1   -1    1    1    1
##[10,]    1   -1    1   -1    1   -1    1    1    1
##[11,]    1   -1    1   -1    1   -1    1    1    1
##[12,]    1   -1    1   -1    1   -1    1    1    1

Update to explain why an eigenvector is only determined up to a scale factor

This can be seen from the definition of an eigenvector. From Wikipedia:

In linear algebra, an eigenvector or characteristic vector of a linear transformation is a non-zero vector that does not change its direction when that linear transformation is applied to it.

In a finite-dimensional vector space, the linear transformation amounts to multiplying the vector by a square matrix A, and therefore the definition is (this is where I wish SO supported LaTeX markdown, as this is not an equation in code; that is, * is matrix multiplication here):

A * v = lambda * v

which is known as the eigenvalue equation for the matrix A, where lambda is the eigenvalue associated with the eigenvector v. From this equation it is clear that if v is an eigenvector of A, then k * v for any non-zero scalar k is also an eigenvector of A, with the same associated eigenvalue lambda.
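
You can verify this numerically with your own matrices; here is a small sketch using the first eigenvector of ATA (the scalar -3 is an arbitrary choice):

e <- eigen(ATA)
v1 <- e$vectors[, 1]
lambda1 <- e$values[1]
max(abs(ATA %*% v1 - lambda1 * v1))                # ~0: v1 satisfies A * v = lambda * v
max(abs(ATA %*% (-3 * v1) - lambda1 * (-3 * v1)))  # ~0: so does any non-zero scalar multiple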

  • Thank you so much, I think I understand. So if I run `SVD <- svd(wxd,nu=nrow(wxd))` I can put my original matrix back together by using the first 9 columns of U (e.g., `U[,1:9] %*% S %*% t(V)`). But this doesn't seem to work with U2 (e.g., `U2[,1:9] %*% S2 %*% t(V2)`). Am I missing something? – snarble Sep 09 '16 at 18:51
  • @snarble: that is because unlike `svd`, `U2` and `V2` are computed via separate calls to `eigen` using `AAT` and `ATA` so that the left and right eigenvectors may be "mutually scaled" differently by "+/-1". If we correct this by multiplying `U2[,1:9]` by the correction `U[,1:9]/U2[,1:9]`, which are columns of all `1`s or `-1`s, and multiplying `V2` by `V/V2`, then we trivially get `(U2[,1:9] * U[,1:9] / U2[,1:9]) %*% S2 %*% t(V2 * V / V2)` which will recover `wxd`. – aichao Sep 09 '16 at 19:23
  • That is very helpful. How would I make sure my two matrices `U` and `V` are mutually scaled to be identical to the `svd` function? – snarble Sep 09 '16 at 19:51
  • @snarble: I don't believe you can. Let me ask you this: what are you really trying to achieve by recovering `wxd` from `U2` and `V2`? – aichao Sep 09 '16 at 19:56
  • Okay, thanks. I would like to write my own function to perform SVD so I can be sure I know what is happening, or at the very least I want to completely understand what is happening when I run `svd(wxd)`. – snarble Sep 09 '16 at 20:03
  • @snarble: I would seriously advise you **not** to do that. There are many numeric nuances to an implementation and even if you do it, no one will care because it has already been done in LAPACK. In fact, R links to LAPACK, which is written in Fortran. Instead I would read a good classic book on the topic such as Golub and Van Loan, *Matrix Computations*, 1983 to understand the algorithm. – aichao Sep 09 '16 at 20:08
  • Is there any way you could go into a little more detail about what you meant when you said "recall that eigenvectors are only determined up to a scale factor"? – snarble Sep 11 '16 at 05:59