I am using PCA for face recognition. I have obtained the eigenvectors / eigenfaces for each image, which is a column matrix. I want to know whether selecting the first three eigenvectors, since their corresponding eigenvalues amount to 70% of the total variance, will be sufficient for face recognition?
2 Answers
First, let's be clear about a few things. The eigenvectors are computed from the covariance matrix formed from the entire dataset, i.e., you reshape each grayscale face image into a single column and treat it as a point in R^d, compute the covariance matrix from these points, and compute the eigenvectors of that covariance matrix. These eigenvectors become a new basis for your space of face images. You do not have eigenvectors for each image. Instead, you represent each face image in terms of the eigenvectors by projecting it onto (possibly a subset of) them.
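To make this concrete, here is a minimal MATLAB-style sketch of the pipeline described above. The variable names (X, m, Uk, W) are illustrative; it uses an economy SVD, which yields the same eigenvectors as forming the d-by-d covariance matrix explicitly but is cheaper when d is large:

    % X is a d-by-n matrix whose n columns are the training faces,
    % each grayscale image reshaped into a d-by-1 vector.
    n  = size(X, 2);
    m  = mean(X, 2);                   % mean face
    Xc = X - repmat(m, 1, n);          % center the data
    [U, S, ~] = svd(Xc, 'econ');       % columns of U are the eigenfaces
    eigvals = diag(S).^2 / (n - 1);    % eigenvalues of the covariance matrix, in descending order
    k  = 3;                            % number of eigenfaces to keep
    Uk = U(:, 1:k);
    W  = Uk' * Xc;                     % k-by-n projection coefficients, one column per face

Each face i is then represented by the k-dimensional column W(:, i) rather than by its raw pixels.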
Limitations of eigenfaces
Whether the representation of your face images under this new basis is good enough for face recognition depends on many factors. In general, though, the eigenfaces method does not perform well on real-world, unconstrained faces. It only works for faces that are pixel-wise aligned, frontal, and captured under fairly uniform illumination conditions across the images.
More is not necessarily better
While it is commonly believed (when using PCA) that retaining more variance is better than retaining less, things are more complicated than that because of two factors: 1) noise in real-world data and 2) the dimensionality of the data. Sometimes projecting to a lower dimension and losing some variance can actually produce better results.
Conclusion
Hence, my answer is that it is difficult to say beforehand whether retaining a certain amount of variance is enough. The number of dimensions (and hence the number of eigenvectors to keep and the associated variance retained) should be determined by cross-validation. But ultimately, as I mentioned above, eigenfaces is not a good method for face recognition unless you have a "nice" dataset. You might be slightly better off using "Fisherfaces", i.e., LDA on the face images, or combining these methods with Local Binary Pattern (LBP) features (instead of raw face pixels). But seriously, face recognition is a difficult problem, and in general the state of the art has not reached a stage where it can be deployed in real-world systems.
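To illustrate one possible way of doing that cross-validation (a sketch under assumptions, not part of the original answer): sweep over k on a held-out validation split and score a simple 1-nearest-neighbour classifier in the projected space. The names Xtrain, Xval, ytrain, yval and the use of pdist2 (Statistics Toolbox) are assumptions here:

    % U and m are the eigenvectors and mean face computed from the training set.
    % Xtrain, Xval hold centered training/validation faces as columns;
    % ytrain, yval are their identity labels (column vectors).
    best_k = 1;  best_acc = 0;
    for k = 1:size(U, 2)
        Uk   = U(:, 1:k);
        Wtr  = Uk' * Xtrain;              % k-by-ntrain coefficients
        Wval = Uk' * Xval;                % k-by-nval coefficients
        D    = pdist2(Wval', Wtr');       % nval-by-ntrain Euclidean distances
        [~, nn] = min(D, [], 2);          % index of the nearest training face
        acc  = mean(ytrain(nn) == yval);  % 1-NN accuracy on the validation set
        if acc > best_acc, best_acc = acc; best_k = k; end
    end

Any classifier (SVM, nearest neighbour, etc.) can stand in for the 1-NN step; the point is that k is chosen by measured accuracy rather than by a fixed variance threshold.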

- Lightalchemist: First of all, thank you for taking the effort to answer in detail. I was, however, confused when you said "You do not have eigenvectors for each image," because I thought each image is associated with its own eigenvalues and eigenvectors. Please explain? And I actually have what you described as a "nice" dataset, which is why I am making use of PCA. – Sid Dec 25 '13 at 17:42
- @Sid: You only have 1 set of eigenvectors for your entire dataset. They are the eigenvectors of the covariance matrix computed from the feature vectors extracted from your dataset. There is only 1 set of them, whereas you said you "obtained the eigenvectors / eigenfaces for *each* image", giving the impression that you somehow computed a set of eigenvectors for *each* image. What you should have for each image is a vector containing the projection of your original feature vector onto these eigenvectors. Essentially, each number in this vector is associated with an eigenvector, and you get back – lightalchemist Dec 26 '13 at 02:03
- @Sid (cont.): your original feature by taking a linear combination of your eigenvectors according to the numbers in the vector. In a way you are "blending" the eigenvectors according to the "weights" given in this vector. After reading your reply, I guess this could be what you mean by "each image is associated with its own eigenvalues and eigenvectors". – lightalchemist Dec 26 '13 at 02:06
- Thank you again. I understood what you are saying about projecting the feature vector onto the eigenvectors. Kind of like changing the basis? Just to be sure, what exactly will the principal components be? The projected feature vectors? – Sid Dec 26 '13 at 16:39
- The principal components are the eigenvectors of your covariance matrix, not the projected feature vectors. It is a change of basis. Essentially, the eigenvectors of your covariance matrix (i.e., the principal components) form a new orthonormal basis. Suppose the k of them associated with the k largest eigenvalues are placed as the columns of a matrix Uk, and suppose x is a column vector for a feature vector in the original space. Then x_proj = Uk' * x. Here the columns of Uk are the principal components and x_proj is x represented using the new basis. Now you classify using x_proj. – lightalchemist Dec 27 '13 at 02:21
- To reconstruct an approximation of x from x_proj, you can do: x_approx = Uk * x_proj. Notice that here you are reconstructing an approximation of x (because you only retained the top k basis vectors). Essentially, the basis vectors (eigenvectors/eigenfaces) here are the "building blocks" for your face, and x_proj gives you the "mixing proportions/weights". Multiplying them together in that way simply gives a linear combination of the eigenvectors/eigenfaces to reconstruct your face. This is a fundamental way of thinking about linear algebra methods. It is also why the faces have to be aligned. – lightalchemist Dec 27 '13 at 02:25
- I have projected the column vectors from the original space onto the face space by multiplying by the eigenface basis. So I obtained the weights, or as you said, x_proj. I would now like to reduce the dimensionality before I use it for training the SVM. How do I go about it? – Sid Dec 31 '13 at 08:34
- I'm assuming your eigenvectors are sorted in descending order of the eigenvalues. Then simply set your projection matrix to Uk, whose k columns correspond to the eigenvectors of the k largest eigenvalues. Then your dimension-reduced vector is x_proj = Uk' * (x - m), where Uk' is Uk transposed, x is your input face, and m is the mean of your training data. This way your x_proj will only have k entries. Basically, the eigenfaces approach is just PCA performed on face images. (See the sketch after these comments.) – lightalchemist Dec 31 '13 at 11:13
- @lightalchemist: It would be really helpful if you answered this question too: http://stackoverflow.com/questions/21427303/is-this-the-right-way-of-projecting-the-training-set-into-the-eigespace-matlab – Sid Jan 29 '14 at 14:18
- @lightalchemist: I have a basic doubt regarding SVM classification using PCA. Do I form different eigenspaces for the two different classes of the SVM? And if yes, which of the eigenspaces do I project the test image into? – Sid Jan 31 '14 at 06:54
- @Sid: No, you do not form 2 separate subspaces, i.e., you do NOT compute 2 different sets of eigenvectors for the training samples from each class. You are supposed to create a SINGLE set of eigenvectors from the covariance matrix formed from ALL your training samples, and use that to project all your training samples. – lightalchemist Jan 31 '14 at 09:42
- @Sid: Btw, next time you might want to ask this as a separate question. I'm not particular about this, but there are others who don't like people asking questions in the comments, etc. The preference is for each post to be a SINGLE well-defined question. – lightalchemist Jan 31 '14 at 09:54
- Thank you for your answer, and I'll remember the tip for next time. – Sid Jan 31 '14 at 10:16
- Your help on this question would be really helpful to me: http://stackoverflow.com/questions/21474331/why-is-the-accuracy-coming-as-0-matlab-libsvm – Sid Feb 03 '14 at 14:44
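Collecting the projection and reconstruction formulas from the comments above into one place, here is a minimal sketch (m and Uk as computed earlier; x is a single face as a d-by-1 column vector; the mean is added back during reconstruction because it was subtracted before projecting):

    x_proj   = Uk' * (x - m);     % k-by-1 coefficient vector ("weights")
    x_approx = m + Uk * x_proj;   % approximate reconstruction of x from k eigenfaces
    % For SVM training, compute x_proj for every training face and use these
    % k-dimensional vectors (plus the identity labels) as the SVM's input features.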
It's not impossible, but it seems a little unusual to me that only 3 eigenvalues can account for 70% of the variance. How many training samples do you have (what is the total dimension)? Make sure you reshape each image from the database into a vector, normalize the vectors, and then stack them into a matrix. The eigenvalues/eigenvectors are obtained from the covariance matrix of that data.
In theory, 70% variance should be enough to form a human-recognizable face from the corresponding eigenvectors. However, the optimal number of eigenvectors is better determined by cross-validation: add one eigenvector at a time and observe the reconstructed face and the recognition accuracy. You can even plot the cross-validation accuracy curve; there may be a sharp corner on the curve, and the number of eigenvectors at that corner is a good candidate to use on your test set.
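As a small illustrative sketch of that curve idea (eigvals is assumed to hold the eigenvalues sorted in descending order):

    explained = cumsum(eigvals) / sum(eigvals);   % fraction of variance retained by the first k eigenvectors
    plot(1:numel(eigvals), explained, '-o');
    xlabel('number of eigenvectors kept');
    ylabel('fraction of total variance retained');
    % Plot your cross-validation accuracy against k in the same way and look for
    % the "corner" where adding more eigenvectors stops improving accuracy.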

- Lennon: Thank you for your answer. I will use the trial-and-error method to find the eigenvectors which are sufficient for recognition. – Sid Dec 25 '13 at 17:48