I am trying to perform PCA reducing 900 dimensions to 10. So far I have:

covariancex = cov(labels);
[V, d] = eigs(covariancex, 40);

pcatrain = (trainingData - repmat(mean(trainingData), 699, 1)) * V;
pcatest = (test - repmat(mean(trainingData), 225, 1)) * V;

Here labels is a 1x699 vector of character labels (1-26), trainingData is 699x900 (699 character images, each with 900 dimensions), and test is 225x900 (225 character images, each with 900 dimensions).

Basically I want to reduce the test data down to 225x10, i.e. 10 dimensions, but I am stuck at this point.

lennon310
user3094936

2 Answers

The covariance should be computed from your trainingData, not from the labels:

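X = bsxfun(@minus, trainingData, mean(trainingData,1));   % center with the training mean
covariancex = (X'*X)./(size(X,1)-1);                      % 900x900 sample covariance

[V, D] = eigs(covariancex, 10);   % 10 leading eigenvectors: reduce to 10 dimensions

Xtest = bsxfun(@minus, test, mean(trainingData,1));       % center test data with the training mean
pcatest = Xtest*V;                                        % 225x10 projection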
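
As a quick sanity check, the same steps can be run on synthetic data to confirm the output shapes. This is only a sketch: the random matrices are stand-ins for the real trainingData and test (assumed to be 699x900 and 225x900, as in the question), and the symmetrization line is an extra precaution that is not part of the steps above.

```matlab
% Synthetic stand-ins for the real data (assumption: same shapes as in the question)
trainingData = rand(699, 900);
test         = rand(225, 900);

mu = mean(trainingData, 1);                   % 1x900 mean of the training set
X  = bsxfun(@minus, trainingData, mu);        % centered training data
covariancex = (X' * X) ./ (size(X, 1) - 1);   % 900x900 sample covariance
covariancex = (covariancex + covariancex') / 2;   % precaution: remove round-off asymmetry

[V, D] = eigs(covariancex, 10);               % 900x10 matrix of leading eigenvectors

pcatrain = X * V;                             % training data projected onto 10 components
pcatest  = bsxfun(@minus, test, mu) * V;      % test data centered with the TRAINING mean

disp(size(pcatrain));
disp(size(pcatest));
```

Both projections use the training mean and the training eigenvectors, so the training and test sets end up in the same 10-dimensional coordinate system.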
lennon310
  • Sorry just one more thing, I remember why I used 40 originally (although I do not need that many) because according to my lecturer, it is better to take dimensions 2:11 rather than 1:10, how would I achieve this? – user3094936 Dec 12 '13 at 19:25
  • [V D]=eigs(covariancex,11); pcatest=Xtest*V(:,2:11); – lennon310 Dec 12 '13 at 19:29

From your code it seems like you are taking the covariance of the labels, not of trainingData. The point of PCA is to find the N (N = 10 here) orthogonal directions along which your data has the greatest variance, so the covariance has to be computed from the data itself.

Your covariance matrix should be 900x900 (if 900 is the dimension of each image, which I assume comes from 30x30 pixel images). The diagonal elements [i,i] of covariancex give the variance of pixel i across all training samples, and the off-diagonal elements [i,j] give the covariance between pixel i and pixel j. This should be a symmetric matrix, since [i,j] == [j,i].
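
These properties can be checked numerically. The following is a minimal sketch on random data; X and labels are hypothetical stand-ins with the shapes given in the question, not the real data.

```matlab
% Hypothetical stand-ins with the shapes from the question
X      = rand(699, 900);        % rows are samples, columns are pixels
labels = randi(26, 1, 699);     % one class label per sample

C = cov(X);                     % cov treats rows as observations, columns as variables
sizeC  = size(C);               % expected to be [900 900]
symErr = max(max(abs(C - C')));                  % close to 0 if C is symmetric
varErr = max(abs(diag(C)' - var(X, 0, 1)));      % diagonal vs. per-pixel variance

% In MATLAB, cov of a vector returns a single scalar variance,
% so cov(labels) cannot supply 900-dimensional eigenvectors.
sizeLabelsCov = size(cov(labels));

disp(sizeC);
disp([symErr, varErr]);
disp(sizeLabelsCov);
```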

Furthermore when calling eigs(covariancex,N), N should be 10 instead of 40 if you want to reduce the dimension to 10.
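
If a different block of components is wanted (for example 2:11, as raised in the comments), one option is to request one extra eigenvector and slice the columns. This is a rough sketch on small random data with made-up dimensions (200 samples, 50 features); the explicit sort is a precaution so that the column order does not depend on the order in which the solver returns the eigenvalues.

```matlab
% Hypothetical small data set: 200 samples, 50 features
X = rand(200, 50);
X = bsxfun(@minus, X, mean(X, 1));     % center the data
C = (X' * X) ./ (size(X, 1) - 1);      % 50x50 sample covariance
C = (C + C') / 2;                      % precaution: remove round-off asymmetry

[V, D] = eigs(C, 11);                  % 11 largest eigenpairs
[lambda, idx] = sort(diag(D), 'descend');   % enforce descending eigenvalue order
V = V(:, idx);

V10   = V(:, 1:10);                    % components 1 to 10
Vskip = V(:, 2:11);                    % components 2 to 11 (first one dropped)

% Fraction of the total variance captured by each choice
explained10   = sum(lambda(1:10)) / trace(C);
explainedSkip = sum(lambda(2:11)) / trace(C);

disp([explained10, explainedSkip]);
```

Dropping the first component always captures no more variance than keeping it, so whether 2:11 is preferable depends on what the first component represents in the data.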

Falimond
  • Sorry, I remember why I used 40 originally (although I do not need that many) because according to my lecturer it is better to take dimensions 2:11 rather than 1:10, how would I achieve this? – user3094936 Dec 12 '13 at 19:25