I need to perform Principal Component Analysis on a dataset. I am using Octave 6.4.0, so this should not be a problem related to outdated software. Anyway, the analysis relies on a very well-tested function (eigs), so I proceeded as usual (the steps are sketched right after this list):
- Compute the covariance matrix.
- Compute the eigenvalues and eigenvectors, in this case using eigs.
- Perform the analysis by projecting the centered data (mean removed) onto the computed eigenvectors.
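For reference, this is roughly the procedure, as a minimal sketch; X, mu, Xc, Cxx, K and scores are illustrative names (in my actual code the covariance matrix is called templates_train_covariance_matrix), and X is assumed to hold one observation per row:

% X: N x d data matrix, one observation per row (illustrative name)
mu  = mean(X);              % 1 x d vector of column means
Xc  = X - mu;               % center the data (broadcasting)
Cxx = cov(Xc);              % d x d covariance matrix
K   = 10;                   % number of components to keep (example value)
[V, E] = eigs(Cxx, K);      % K leading eigenvectors (columns of V) and eigenvalues
scores = Xc * V;            % project the centered data onto the eigenvectors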
After doing that, I observed weird behavior in the projected data, so I inspected the eigenvectors, and what I found was quite shocking: none of the returned eigenvectors has unit norm, which is already a bad symptom. I checked the returned convergence flag, but surprisingly it was 0, so no problem was detected, even though the returned eigenvectors clearly have one.

I then asked eigs to iteratively compute only the first k eigenvectors, with k = {1, ..., rank(C_{xx})}, where C_{xx} is the covariance matrix. This matrix is full rank (at least, the minimum returned eigenvalue is way bigger than 0 when k == cols(C_{xx}) == rows(C_{xx})), so rank deficiency should not be the problem. I have not found any report of a similar issue, and this is the first time I have run into it, despite having applied this technique and used eigs numerous times.
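For completeness, this is roughly what those checks look like (a sketch; Cxx stands in for my actual covariance matrix variable, and the third output of eigs is the convergence flag):

K = columns(Cxx);                        % request the full spectrum (64 in my case)
[V, E, flag] = eigs(Cxx, K);             % flag is 0, i.e. eigs reports convergence
fprintf('flag = %d\n', flag);
fprintf('column norms:\n');
disp(sqrt(sum(V.^2)));                   % should all be ~1, but they are not
fprintf('min eigenvalue = %g\n', min(diag(E)));   % way bigger than 0, so full rank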
Here you can find the code that demonstrates this weird behavior:
for K=1:64
  % request the K leading eigenpairs of the covariance matrix
  [V, E] = eigs(templates_train_covariance_matrix, K);
  % the first eigenvector should always have unit norm
  fprintf(1, 'K = %i; norm(V1) = %f\n', K, norm(V(:,1)));
endfor
K = 1; norm(V1) = 1.000000
K = 2; norm(V1) = 1.000000
K = 3; norm(V1) = 1.000000
K = 4; norm(V1) = 1.000000
K = 5; norm(V1) = 1.000000
K = 6; norm(V1) = 1.000000
K = 7; norm(V1) = 1.000000
K = 8; norm(V1) = 1.000000
K = 9; norm(V1) = 1.000000
K = 10; norm(V1) = 1.000000
K = 11; norm(V1) = 1.000000
K = 12; norm(V1) = 1.000000
K = 13; norm(V1) = 1.000000
K = 14; norm(V1) = 1.000000
K = 15; norm(V1) = 1.000000
K = 16; norm(V1) = 1.000000
K = 17; norm(V1) = 1.000000
K = 18; norm(V1) = 1.000000
K = 19; norm(V1) = 1.000000
K = 20; norm(V1) = 1.000000
K = 21; norm(V1) = 1.000000
K = 22; norm(V1) = 1.000000
K = 23; norm(V1) = 1.000000
K = 24; norm(V1) = 1.000000
K = 25; norm(V1) = 1.000000
K = 26; norm(V1) = 1.000000
K = 27; norm(V1) = 1.000000
K = 28; norm(V1) = 1.000000
K = 29; norm(V1) = 1.000000
K = 30; norm(V1) = 1.000000
K = 31; norm(V1) = 1.000000
K = 32; norm(V1) = 502385811859.998413
K = 33; norm(V1) = 19456763893.455475
K = 34; norm(V1) = 69151.227303
K = 35; norm(V1) = 60981968713.719788
K = 36; norm(V1) = 13353210849.067781
K = 37; norm(V1) = 27614930720.959068
K = 38; norm(V1) = 648885167180.564331
K = 39; norm(V1) = 1078167.777569
K = 40; norm(V1) = 29009353767.082512
K = 41; norm(V1) = 36887852559.354523
K = 42; norm(V1) = 11190161.502876
K = 43; norm(V1) = 67319513050.957047
K = 44; norm(V1) = 26536642028.965473
K = 45; norm(V1) = 21020237947.192379
K = 46; norm(V1) = 79513212336.981018
K = 47; norm(V1) = 31980738852.512493
K = 48; norm(V1) = 4629669543880.510742
K = 49; norm(V1) = 36066934562.318390
K = 50; norm(V1) = 39489763493.994087
K = 51; norm(V1) = 1015181634977.895996
K = 52; norm(V1) = 12591902465.720659
K = 53; norm(V1) = 292456782831.118652
K = 54; norm(V1) = 206117610126.212677
K = 55; norm(V1) = 218166497201.257477
K = 56; norm(V1) = 1059363.341610
K = 57; norm(V1) = 309852.877416
K = 58; norm(V1) = 404729039821.097961
K = 59; norm(V1) = 1289867814341.026123
K = 60; norm(V1) = 35811264888.668854
K = 61; norm(V1) = 72841577175.781281
K = 62; norm(V1) = 19267635.939175
K = 63; norm(V1) = 14730.384760
K = 64; norm(V1) = 87973266862.405182
I can also share the problematic covariance matrix if anyone wants to reproduce this.
Any idea why this may be happening?