I have a question with regard to the predict function of the FactorizationRecommender.
I have a large dataset of user-item pairs, with a binary rating for each pair. Note that users have not interacted with all items, so the rating matrix is very sparse.
Subsequently, I remove all ratings of one user (whom I designate as the cold user) from the dataset. On all remaining user-item pairs, I train a matrix factorization model (factorization_recommender.create(..., binary_target=True)).
Now, I would like to make predictions for the remaining ratings of the cold user when I show the model a fraction of the cold user's ratings (e.g., I show the model 10 of the cold user's ratings and want to compute predicted ratings for all other items). Next I want to compute the RMSE of the predictions ONLY for the cold user.
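To make the evaluation concrete, here is a rough sketch of the protocol I have in mind, in plain Python. The helper names (split_cold_user, rmse) and the toy data are made up for illustration, and the constant predictions are only a placeholder for whatever scores the model would return.

```python
import math
import random


def split_cold_user(ratings, n_shown, seed=0):
    """Split one user's (item, rating) pairs into shown and held-out parts."""
    rng = random.Random(seed)
    shuffled = list(ratings)
    rng.shuffle(shuffled)
    return shuffled[:n_shown], shuffled[n_shown:]


def rmse(held_out, predicted):
    """RMSE over the held-out pairs; `predicted` maps item id -> score."""
    squared_errors = [(predicted[item] - rating) ** 2 for item, rating in held_out]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical cold user with 15 binary ratings.
cold_ratings = [("item_%d" % i, i % 2) for i in range(15)]

# Show 10 ratings to the model, keep the rest for evaluation.
shown, held_out = split_cold_user(cold_ratings, n_shown=10)

# Placeholder scores (assumption): in practice these would come from the model.
predicted = {item: 0.5 for item, _ in held_out}

cold_user_rmse = rmse(held_out, predicted)
print(cold_user_rmse)
```

The RMSE is computed only over the cold user's held-out pairs, never over the pairs shown to the model.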
My question is two-fold. First of all, it is not entirely clear to me which arguments to pass to the FactorizationRecommender.predict function.
Should the fraction of user-item pairs (and binary ratings) that I want to show to the model (e.g., the 10 ratings) be passed as new_observation_data? And what should I pass as dataset? The initial training dataset?
Secondly, my question is how the FactorizationRecommender.predict function precisely works (what's happening in the background)? How can you make predictions on a user that is not included in the initial training dataset? As the latent factors of the factorization are not built for this user, how are his/her predictions made?
My current version of GraphLab Create is v1.10.1.
Thanks for your help!