I have the MovieLens dataset and want to apply dimensionality reduction to it using PCA. First I compute the covariance matrix of the dataset, then its eigenvalues. The problem is that when I print the result, I cannot tell which eigenvalue belongs to which movie. I use NumPy to compute the eigenvalues.
Here is my code:

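import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Load movie names and movie ratings
movies = pd.read_csv('movies.csv')
ratings = pd.read_csv('ratings.csv')
ratings.drop(['timestamp'], axis=1, inplace=True)

# Replace each movieId with the matching movie title
def replace_name(x):
    return movies[movies['movieId'] == x].title.values[0]
ratings.movieId = ratings.movieId.map(replace_name)

# Build the user x movie rating matrix; unrated movies become 0
M = ratings.pivot_table(index=['userId'], columns=['movieId'], values='rating')
m = M.shape
df1 = M.fillna(0)

# X_std was never defined in the snippet; assumed here to be df1 standardized per column
X_std = StandardScaler().fit_transform(df1)

# Perform eigendecomposition on the covariance matrix (movies x movies)
cov_mat = np.cov(X_std.T)
eig_vals, eig_vecs = np.linalg.eig(cov_mat)
print('\nEigenvalues \n%s' % eig_vals)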

The number of eigenvalues my code produces equals the number of movies, but I don't know which eigenvalue belongs to which movie.

Daniel.V
  • Eigenvalues do not belong to a specific row or column in the matrix, but to eigenvectors. – smernst Nov 28 '16 at 10:55
  • As far as I know, in the PCA algorithm we compute eigenvalues and eigenvectors for each feature, which here are my movies. – Daniel.V Nov 28 '16 at 11:05
  • Eigenvalues and eigenvectors are not calculated for each feature separately but for the set as a whole. The eigenvectors are combinations of features, or directions in your feature space. – smernst Nov 28 '16 at 11:24
  • So how should I apply PCA to this dataset in order to reduce the number of movies? – Daniel.V Nov 28 '16 at 11:48
  • Maybe you can explain what you ultimately want to do. Usually you use PCA in a scenario where you want to predict a label from a number of features. If your dataset has many features, you use it to decide which features to pick. In your case I don't see many features, just the movie name and the corresponding rating. – molig Nov 28 '16 at 18:13
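
To make the point from the comments concrete, here is a minimal sketch of how each eigenvalue pairs with an eigenvector, and how that eigenvector's entries (not the eigenvalue) line up with the movie columns. The toy ratings table and the names `loadings`, `explained`, `reduced` and `k` are made up for illustration and are not from the question. The sketch also swaps in `np.linalg.eigh` for `np.linalg.eig`, since the covariance matrix is symmetric, and centers the columns without scaling them.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data (users x movies); 0 stands for "not rated",
# mirroring the zero-filled pivot table in the question.
ratings = pd.DataFrame(
    [[5, 4, 0, 1],
     [4, 5, 1, 0],
     [1, 0, 5, 4],
     [0, 1, 4, 5],
     [5, 5, 1, 1]],
    columns=["Movie A", "Movie B", "Movie C", "Movie D"],
    dtype=float,
)

# Center each movie column; the covariance matrix is movies x movies.
X = ratings - ratings.mean(axis=0)
cov_mat = np.cov(X.to_numpy(), rowvar=False)

# eigh is intended for symmetric matrices: it returns real eigenvalues
# in ascending order, with eigenvectors as the columns of eig_vecs.
eig_vals, eig_vecs = np.linalg.eigh(cov_mat)

# Reorder so the component with the largest variance comes first.
order = np.argsort(eig_vals)[::-1]
eig_vals, eig_vecs = eig_vals[order], eig_vecs[:, order]

# Row i of eig_vecs corresponds to movie column i, so labelling the rows
# with the movie titles shows how much each movie contributes to each
# principal component. eig_vals[j] belongs to column j (component PCj),
# not to any single movie.
loadings = pd.DataFrame(
    eig_vecs,
    index=ratings.columns,
    columns=[f"PC{j + 1}" for j in range(len(eig_vals))],
)
print(loadings)

# Fraction of the total variance captured by each component.
explained = eig_vals / eig_vals.sum()
print(explained)

# Dimensionality reduction: project every user onto the first k components.
k = 2
reduced = X.to_numpy() @ eig_vecs[:, :k]
print(reduced.shape)
```

With the real data, `ratings.columns` would be replaced by `M.columns` from the pivot table. Note that the result is a smaller set of combined features per user rather than a subset of the original movies.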

0 Answers