I am getting confused trying to run PCA on a set of spatial grids that have been read into numpy arrays. As an array the set looks like this, where mdata[0] holds the rows and columns of a single grid and each value is a grid cell:
mdata = np.array([ [[3, 4, 6, 4],
[4, 9, 2, 7],
[8, 7, 2, 2]],
[[1, 3, 7, 9],
[3, 1, 8, 8],
[2, 6, 7, 5]],
[[1, 8, 5, 2],
[6, 4, 7, 3],
[3, 1, 4, 6]],
[[5, 8, 2, 4],
[3, 7, 7, 7],
[1, 4, 8, 2]],
[[4, 5, 6, 4],
[1, 7, 5, 7],
[2, 1, 3, 3]] ])
I then flatten each grid into a row so that the covariance can be calculated across the set; the input array then looks like this:
[[3 4 6 4 4 9 2 7 8 7 2 2]
[1 3 7 9 3 1 8 8 2 6 7 5]
[1 8 5 2 6 4 7 3 3 1 4 6]
[5 8 2 4 3 7 7 7 1 4 8 2]
[4 5 6 4 1 7 5 7 2 1 3 3]]
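The flattening itself is just a reshape; a minimal sketch, assuming mdata is the 3-D array above:

import numpy as np

# flatten each 3x4 grid into a 12-element row, giving a 5 x 12 array
mdata2 = mdata.reshape(mdata.shape[0], -1)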
I need to generate a set of output arrays (ultimately converted back to grids) representing the principal components. Each output array is built by multiplying each input grid by the corresponding eigenvector value and summing those products across the set of input grids (i.e., the input array). Each output grid will then represent one principal component, in decreasing order of explained variance (there is a rough code sketch of this after the equations below).
So the final equations will look something like this:
First principal component (uses first eigenvector array row??):
(input_grid_1 * evec_row_1_val_1) + (input_grid_2 * evec_row_1_val_2) + ... + (input_grid_n * evec_row_1_val_n)
Second principal component (uses second eigenvector array row??):
(input_grid_1 * evec_row_2_val_1) + (input_grid_2 * evec_row_2_val_2) + ... + (input_grid_n * evec_row_2_val_n)
etc...
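To make the intended computation concrete, here is a literal sketch of the equations above. The names weights and pc_grid are placeholders: weights stands for whichever slice of the eigenvector output turns out to be the right one, which is exactly what I'm unsure about:

import numpy as np

def pc_grid(grids, weights):
    # grids: shape (n_grids, n_rows, n_cols); weights: one value per input grid
    # weighted sum of the input grids, as in the equations above
    return np.tensordot(weights, grids, axes=1)

So pc_grid(mdata, first_evec) would give the first principal component grid, once I know which vector first_evec should actually be.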
My question is... after generating the eigenvectors via numpy.linalg.eig, I'm not sure of their ordering. Does evec[0] (see output below) contain the eigenvector I should use to calculate the first principal component output array? Or should I start with the last row (i.e., evec[11] below)? Or are the eigenvectors arranged column-wise in the output from numpy.linalg.eig?
Thank you for any help, and I apologize for the length of this.
Here is some sample code I'm using, and the resulting output:
print "flattened mdata:\n", mdata2, "\n"
mdata2 -= np.mean(mdata2, axis=0)
print "mean centered data:\n", mdata2, "\n"
covar = np.cov(mdata2, rowvar=0)
print "covariance matrix:\n", covar, "\n"
eval, evec = la.eig(covar)
print "eigenvectors:\n", evec, "\n"
flattened mdata:
[[3 4 6 4 4 9 2 7 8 7 2 2]
[1 3 7 9 3 1 8 8 2 6 7 5]
[1 8 5 2 6 4 7 3 3 1 4 6]
[5 8 2 4 3 7 7 7 1 4 8 2]
[4 5 6 4 1 7 5 7 2 1 3 3]]
mean centered data:
[[ 0 -1 0 0 0 3 -3 0 4 3 -2 -1]
[-1 -2 1 4 0 -4 2 1 -1 2 2 1]
[-1 2 0 -2 2 -1 1 -3 0 -2 0 2]
[ 2 2 -3 0 0 1 1 0 -2 0 3 -1]
[ 1 0 0 0 -2 1 0 0 -1 -2 -1 0]]
covariance matrix:
[[ 1.7 0.95 -1.65 -0.6 -1. 2. -0.3 0.6 -1. -0.55 0.65 -1.3 ]
[ 0.95 3.2 -1.9 -3.1 1. 1.25 0.7 -1.9 -1.5 -2.8 0.9 0.2 ]
[-1.65 -1.9 2.3 1.2 0. -1.75 -0.15 0.05 1.25 0.6 -1.55 1.1 ]
[-0.6 -3.1 1.2 4.8 -1. -3.5 1.4 2.7 -1. 2.9 1.8 -0.1 ]
[-1. 1. 0. -1. 2. -1. 0.5 -1.5 0.5 0. 0.5 1. ]
[ 2. 1.25 -1.75 -3.5 -1. 7. -4.25 -0.25 3.25 0.25 -3. -2.5 ]
[-0.3 0.7 -0.15 1.4 0.5 -4.25 3.7 -0.15 -4. -1.8 3.15 1.45]
[ 0.6 -1.9 0.05 2.7 -1.5 -0.25 -0.15 2.3 -0.25 2.1 0.7 -1.15]
[-1. -1.5 1.25 -1. 0.5 3.25 -4. -0.25 5.5 3. -3.75 -0.75]
[-0.55 -2.8 0.6 2.9 0. 0.25 -1.8 2.1 3. 5.2 -0.1 -1.3 ]
[ 0.65 0.9 -1.55 1.8 0.5 -3. 3.15 0.7 -3.75 -0.1 4.3 0.15]
[-1.3 0.2 1.1 -0.1 1. -2.5 1.45 -1.15 -0.75 -1.3 0.15 1.7 ]]
eigenvectors:
[[-0.05226275 +0.00000000e+00j 0.14243804 +0.00000000e+00j
0.40669907 +0.00000000e+00j -0.08874805 +0.00000000e+00j
0.24983235 +0.00000000e+00j -0.07788035 +0.00000000e+00j
-0.20417938 +8.64129842e-02j -0.20417938 -8.64129842e-02j
0.17142025 -1.53009928e-02j 0.17142025 +1.53009928e-02j
-0.07359380 -5.42353843e-02j -0.07359380 +5.42353843e-02j]
[ 0.01313939 +0.00000000e+00j 0.46207747 +0.00000000e+00j
0.04053819 +0.00000000e+00j 0.24412764 +0.00000000e+00j
-0.29669270 +0.00000000e+00j -0.11848560 +0.00000000e+00j
-0.00129189 -1.08527235e-01j -0.00129189 +1.08527235e-01j
0.24951441 -4.06744086e-02j 0.24951441 +4.06744086e-02j
-0.08613229 -4.16143549e-02j -0.08613229 +4.16143549e-02j]
[ 0.01212470 +0.00000000e+00j -0.23533069 +0.00000000e+00j
-0.38627976 +0.00000000e+00j -0.29050062 +0.00000000e+00j
-0.04031410 +0.00000000e+00j 0.11914628 +0.00000000e+00j
-0.03397920 -4.04899870e-01j -0.03397920 +4.04899870e-01j
-0.02317506 +1.11224412e-01j -0.02317506 -1.11224412e-01j
0.16309959 +1.28363333e-02j 0.16309959 -1.28363333e-02j]
[ 0.25764595 +0.00000000e+00j -0.49039161 +0.00000000e+00j
0.17582472 +0.00000000e+00j -0.08694414 +0.00000000e+00j
-0.49772683 +0.00000000e+00j 0.04915654 +0.00000000e+00j
-0.22312499 +1.60190725e-02j -0.22312499 -1.60190725e-02j
-0.13318403 -2.09741709e-01j -0.13318403 +2.09741709e-01j
-0.06711105 +1.03673483e-01j -0.06711105 -1.03673483e-01j]
[ 0.04158039 +0.00000000e+00j 0.08806516 +0.00000000e+00j
-0.28689676 +0.00000000e+00j 0.55985795 +0.00000000e+00j
0.16139009 +0.00000000e+00j -0.31020587 +0.00000000e+00j
0.06748141 -1.77634044e-02j 0.06748141 +1.77634044e-02j
-0.06824769 -3.42630421e-01j -0.06824769 +3.42630421e-01j
-0.01037536 -1.65831604e-01j -0.01037536 +1.65831604e-01j]
[-0.55737468 +0.00000000e+00j 0.20542936 +0.00000000e+00j
0.32117797 +0.00000000e+00j -0.04132469 +0.00000000e+00j
-0.38616878 +0.00000000e+00j -0.07888938 +0.00000000e+00j
-0.18663356 -2.26138514e-01j -0.18663356 +2.26138514e-01j
-0.48555065 +0.00000000e+00j -0.48555065 +0.00000000e+00j
-0.05777089 +8.39094379e-02j -0.05777089 -8.39094379e-02j]
[ 0.44645722 +0.00000000e+00j 0.09345400 +0.00000000e+00j
-0.01958382 +0.00000000e+00j 0.00350717 +0.00000000e+00j
-0.49307361 +0.00000000e+00j -0.19705809 +0.00000000e+00j
-0.26393665 +1.65676777e-01j -0.26393665 -1.65676777e-01j
-0.07097582 +1.77803457e-01j -0.07097582 -1.77803457e-01j
0.31461793 -8.41424299e-02j 0.31461793 +8.41424299e-02j]
[ 0.03562756 +0.00000000e+00j -0.29183500 +0.00000000e+00j
0.34969990 +0.00000000e+00j -0.16770853 +0.00000000e+00j
0.18499349 +0.00000000e+00j -0.27031753 +0.00000000e+00j
0.04149287 -1.82755302e-01j 0.04149287 +1.82755302e-01j
0.27944344 -1.37299309e-01j 0.27944344 +1.37299309e-01j
-0.09637630 -4.39851477e-01j -0.09637630 +4.39851477e-01j]
[-0.46659969 +0.00000000e+00j -0.23849866 +0.00000000e+00j
-0.26877609 +0.00000000e+00j 0.24093339 +0.00000000e+00j
-0.35050288 +0.00000000e+00j 0.44403480 +0.00000000e+00j
-0.43610595 +0.00000000e+00j -0.43610595 +0.00000000e+00j
0.37131095 +1.81269636e-01j 0.37131095 -1.81269636e-01j
0.29146786 -2.22035640e-01j 0.29146786 +2.22035640e-01j]
[-0.13939660 +0.00000000e+00j -0.51461522 +0.00000000e+00j
0.15383834 +0.00000000e+00j 0.51033734 +0.00000000e+00j
0.13867222 +0.00000000e+00j -0.36401539 +0.00000000e+00j
0.19655716 +8.68943320e-02j 0.19655716 -8.68943320e-02j
-0.16900231 +9.74521096e-02j -0.16900231 -9.74521096e-02j
-0.20194713 +1.85123869e-01j -0.20194713 -1.85123869e-01j]
[ 0.39116291 +0.00000000e+00j 0.04835035 +0.00000000e+00j
0.32293099 +0.00000000e+00j 0.42167036 +0.00000000e+00j
-0.04351453 +0.00000000e+00j 0.61425451 +0.00000000e+00j
-0.20683758 -3.83249003e-01j -0.20683758 +3.83249003e-01j
-0.02544963 +2.31011753e-01j -0.02544963 -2.31011753e-01j
0.13276711 -1.26161298e-04j 0.13276711 +1.26161298e-04j]
[ 0.16560064 +0.00000000e+00j 0.04924581 +0.00000000e+00j
-0.38012669 +0.00000000e+00j -0.03145605 +0.00000000e+00j
0.06088502 +0.00000000e+00j -0.20480777 +0.00000000e+00j
-0.23192947 -1.44999255e-01j -0.23192947 +1.44999255e-01j
-0.27861106 -9.88363770e-05j -0.27861106 +9.88363770e-05j
-0.60558363 +0.00000000e+00j -0.60558363 +0.00000000e+00j]]
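For what it's worth, if the eigenvectors do come back column-wise (i.e., evec[:, i] pairs with evals[i]) and are not already sorted, I assume I would need to reorder them by decreasing eigenvalue before building the output grids, something like the sketch below, but I'd like to confirm that interpretation before relying on it:

import numpy as np
import numpy.linalg as la

evals, evec = la.eig(covar)   # covar is the 12x12 covariance matrix computed above

# if the eigenvectors are the COLUMNS of evec (evec[:, i] pairs with evals[i]),
# reorder both so the largest eigenvalue (most explained variance) comes first
order = np.argsort(evals)[::-1]   # eigenvalues of a covariance matrix should be real
evals_sorted = evals[order]
evec_sorted = evec[:, order]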