When decomposing a rating matrix for a recommender system, the rating matrix can be written as P * t(Q), where P is the user factor matrix and Q is the item factor matrix. The dimensions of Q can be written as rank * number of items. I am wondering whether the values in the Q matrix actually represent anything, such as the weight of the item. Also, is there any way to find hidden patterns in the Q matrix?

1 Answer

Think of features as the important directions of variance in multidimensional data. Imagine a 3-d chart plotting which of 3 items each user bought. It would be an amorphous blob, but the actual orientation of the blob is probably not along the x, y, z axes. The vectors that it does orient along are the features in vector form. Take this to very high-dimensional data (many users, many items): such data can very often be spanned by a small number of vectors, and most variance not along these new axes is very small and may even be noise. So an algorithm like ALS finds the few vectors that represent most of the span of the data. "Features" can therefore be thought of as the primary modes of variance in the data or, put another way, the archetypes for describing how one item differs from another.
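To make the "few dominant directions" idea concrete, here is a minimal sketch. It uses a plain SVD from NumPy as a stand-in for ALS (the two are not the same algorithm, but both expose a low-rank structure), and the data is synthetic: the sizes, the noise level, and the names `P_true`, `Q_true`, and `top3_share` are all assumptions made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 200 users x 50 items, generated from 3 latent
# directions plus a small amount of noise.
n_users, n_items, true_rank = 200, 50, 3
P_true = rng.normal(size=(n_users, true_rank))
Q_true = rng.normal(size=(n_items, true_rank))
R = P_true @ Q_true.T + 0.01 * rng.normal(size=(n_users, n_items))

# Each singular value says how much of the data lies along one new axis.
singular_values = np.linalg.svd(R, compute_uv=False)
energy = singular_values**2 / np.sum(singular_values**2)

# Share of the total energy captured by the first 3 axes.
top3_share = energy[:true_rank].sum()
print(f"share captured by the first {true_rank} axes: {top3_share:.5f}")
```

With data built this way, almost all of the energy sits on the first three axes and the remaining 47 carry only the noise, which is the blob-orientation picture in matrix form.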

Note that PQ factorization in recommenders relies on dropping insignificant features to achieve potentially huge compression of the data. These insignificant features (ones that account for very little variance in the user/item input) can be dropped because they can often be interpreted as noise, and in practice discarding them yields better results.
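A rough sketch of that trade-off, again using a truncated SVD in place of ALS and purely synthetic data: it keeps only `k` features, counts how many numbers need to be stored, and checks whether the truncated reconstruction is closer to the noise-free matrix than the noisy input was. Here `Q` is stored as items x rank so that `P @ Q.T` reproduces the ratings; that layout, like the sizes and noise level, is an assumption of the example.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical noise-free ratings of rank k, plus a noisy observed copy.
n_users, n_items, k = 100, 40, 2
R = rng.normal(size=(n_users, k)) @ rng.normal(size=(k, n_items))
R_noisy = R + 0.05 * rng.normal(size=R.shape)

# Keep only the k strongest features and drop the rest.
U, s, Vt = np.linalg.svd(R_noisy, full_matrices=False)
P = U[:, :k] * s[:k]   # user factors, shape (n_users, k)
Q = Vt[:k, :].T        # item factors, shape (n_items, k)
R_hat = P @ Q.T        # rank-k reconstruction

# Storage: full matrix versus the two factor matrices.
stored_full = n_users * n_items
stored_factored = k * (n_users + n_items)

# Relative error against the noise-free matrix, before and after truncation.
err_noisy = np.linalg.norm(R_noisy - R) / np.linalg.norm(R)
err_truncated = np.linalg.norm(R_hat - R) / np.linalg.norm(R)
print(stored_full, stored_factored, err_noisy, err_truncated)
```

In this setup the factors hold 280 numbers instead of 4000, and the truncated reconstruction should land closer to the clean matrix than the noisy observations do, since most of the discarded components contain only noise.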

Can you find hidden patterns? Sure. The new smaller but dense item and user vectors can be treated with techniques like clustering, KNN, etc. They are just vectors in a new "space" defined by the new basis vectors, the new axes. When you want to interpret the result of such operations, you will need to transform them back into item and user space.
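As one example of such a technique, a nearest-neighbour lookup over item vectors might look like the sketch below. The item names and factor values are invented for illustration, `nearest_items` is a hypothetical helper, and cosine similarity is just one reasonable choice of distance.

```python
import numpy as np

# Hypothetical item factor matrix: one row per item, one column per feature.
item_names = ["item_a", "item_b", "item_c", "item_d"]
Q = np.array([
    [1.0, 0.1],
    [0.9, 0.2],
    [0.1, 1.0],
    [0.2, 0.8],
])

def nearest_items(Q, index, top_n=1):
    """Return indices of the top_n items most similar to Q[index] (cosine)."""
    unit = Q / np.linalg.norm(Q, axis=1, keepdims=True)
    sims = unit @ unit[index]
    sims[index] = -np.inf  # never return the query item itself
    return np.argsort(-sims)[:top_n]

neighbors_of_a = [item_names[i] for i in nearest_items(Q, 0)]
neighbors_of_c = [item_names[i] for i in nearest_items(Q, 2)]
print(neighbors_of_a, neighbors_of_c)
```

The result is expressed as item indices, so mapping back to names (or titles, categories, and so on) is the step that turns a pattern in feature space into something you can interpret.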

The essence of ALS (PQ matrix factorization) is to transform the user's feature vector into item space and rank by the item weights. The highest-ranked items are recommended.
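The scoring step can be sketched as a single matrix-vector product followed by a sort. The factor values are made up, `Q` is assumed to be stored as items x rank, and `recommend` and `already_seen` are hypothetical names; a real system would take `p_u` and `Q` from a trained model.

```python
import numpy as np

# Hypothetical trained factors: 4 items and 1 user, 2 features each.
Q = np.array([
    [1.0, 0.0],
    [0.0, 1.0],
    [0.5, 0.5],
    [0.9, 0.1],
])
p_u = np.array([2.0, 0.5])

def recommend(p_u, Q, already_seen=(), top_n=2):
    """Score every item for one user and return the top_n unseen item indices."""
    scores = (Q @ p_u).astype(float)  # one predicted weight per item
    seen = np.asarray(list(already_seen), dtype=int)
    scores[seen] = -np.inf            # skip items the user already has
    return np.argsort(-scores)[:top_n]

top_items = recommend(p_u, Q, already_seen=[0])
print(top_items)
```

Here `Q @ p_u` is the user's row of P * t(Q), so each entry is the predicted weight of one item for that user. Item 0 scores highest but is excluded as already seen, which leaves items 3 and 2 as the recommendations.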

pferrel