I am comparing np.linalg.eig and np.linalg.eigh for PCA, and I would like clarification on why the results are partly inverted.
f1 and f2 are my 2 features.
pc1 and pc2 are the principal components using eig.
pc1h and pc2h are the principal components using eigh.
As can be seen, pc2 and pc2h appear very similar.
But pc1 appears to be inverted relative to pc1h. Also, pc1h is similar in shape to f2, which is what I would expect.
Why is pc1 upside-down?
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
# Generate a dummy dataset: 50 samples of 2 integer features.
np.random.seed(300)
X = np.random.randint(10, 50, 100).reshape(50, 2)
X[:, 1] = X[:, 1] * 10  # scale up the second feature
def PCA(X, num_components):
    # Step-1: mean-centre the data
    X_meaned = X - np.mean(X, axis=0)
    # Step-2: covariance matrix of the features
    cov_mat = np.cov(X_meaned, rowvar=False)
    # Step-3: eigendecomposition using eig
    eigen_values, eigen_vectors = np.linalg.eig(cov_mat)
    # Step-4: sort eigenvalues/eigenvectors in descending order
    sorted_index = np.argsort(eigen_values)[::-1]
    sorted_eigenvalue = eigen_values[sorted_index]
    sorted_eigenvectors = eigen_vectors[:, sorted_index]
    # Step-5: keep the leading num_components eigenvectors
    eigenvector_subset = sorted_eigenvectors[:, 0:num_components]
    # Step-6: project the centred data onto the selected components
    X_reduced = np.dot(eigenvector_subset.transpose(), X_meaned.transpose()).transpose()
    return X_reduced
def PCAh(X, num_components):
    # Identical to PCA() above, except that Step-3 uses eigh instead of eig.
    # Step-1: mean-centre the data
    X_meaned = X - np.mean(X, axis=0)
    # Step-2: covariance matrix of the features
    cov_mat = np.cov(X_meaned, rowvar=False)
    # Step-3: eigendecomposition using eigh
    eigen_values, eigen_vectors = np.linalg.eigh(cov_mat)
    # Step-4: sort eigenvalues/eigenvectors in descending order
    sorted_index = np.argsort(eigen_values)[::-1]
    sorted_eigenvalue = eigen_values[sorted_index]
    sorted_eigenvectors = eigen_vectors[:, sorted_index]
    # Step-5: keep the leading num_components eigenvectors
    eigenvector_subset = sorted_eigenvectors[:, 0:num_components]
    # Step-6: project the centred data onto the selected components
    X_reduced = np.dot(eigenvector_subset.transpose(), X_meaned.transpose()).transpose()
    return X_reduced
X_reduced = PCA(X, 2)
X_reducedh = PCAh(X, 2)
df_y = pd.DataFrame()
df_y['f1'] = X[:, 0]
df_y['f2'] = X[:, 1]
df_y['pc1'] = X_reduced[:, 0]
df_y['pc2'] = X_reduced[:, 1]
df_y['pc1h'] = X_reducedh[:, 0]
df_y['pc2h'] = X_reducedh[:, 1]
df_y.plot()
plt.show(block=True)
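To narrow the question down, here is a minimal diagnostic sketch (my own addition, separate from the script above, reusing the same seeded dummy X; the variable names are just for this snippet). It compares the sorted eigenvector matrices returned by eig and eigh column by column, allowing for an overall sign flip:
import numpy as np

np.random.seed(300)
X = np.random.randint(10, 50, 100).reshape(50, 2)
X[:, 1] = X[:, 1] * 10

# Same covariance matrix as inside PCA()/PCAh().
X_meaned = X - np.mean(X, axis=0)
cov_mat = np.cov(X_meaned, rowvar=False)

# Eigendecomposition with both routines, each sorted by descending eigenvalue.
vals, vecs = np.linalg.eig(cov_mat)
vals_h, vecs_h = np.linalg.eigh(cov_mat)
vecs = vecs[:, np.argsort(vals)[::-1]]
vecs_h = vecs_h[:, np.argsort(vals_h)[::-1]]

# Compare the eigenvector columns, allowing for a sign flip.
for i in range(vecs.shape[1]):
    identical = np.allclose(vecs[:, i], vecs_h[:, i])
    flipped = np.allclose(vecs[:, i], -vecs_h[:, i])
    print(f"component {i}: identical={identical}, sign-flipped={flipped}")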