I'm trying to come up with a topic-based recommender system to suggest relevant text documents to users.
I trained a latent semantic indexing (LSI) model with gensim on the Wikipedia corpus, which lets me easily transform any document into a vector in the LSI topic space. My idea now is to represent users in the same space. The complication is that a user is not a single document: each user has a history of viewed articles, along with ratings for some of them.
So my question is: how should I represent the users?
One idea I had is to represent a user as an aggregation of the LSI vectors of all the documents they have viewed. But how should the ratings be taken into account?
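
To make the aggregation idea concrete, here is a minimal sketch of one possible approach: a rating-weighted average of the document vectors. I have not validated it. The function names, the 1-to-5 rating scale, and the parameters `neutral_rating` and `view_weight` are all assumptions made up for illustration. Document vectors are lists of (topic_id, value) tuples, the sparse format gensim's LSI model produces.

```python
import math


def user_profile(history, num_topics, neutral_rating=3.0, view_weight=0.5):
    """Build a user vector as a rating-weighted average of LSI document vectors.

    history: list of (lsi_vector, rating) pairs. lsi_vector is a list of
    (topic_id, value) tuples; rating is a number, or None if the article
    was viewed but not rated.
    """
    profile = [0.0] * num_topics
    total = 0.0
    for lsi_vector, rating in history:
        # Assumption: ratings are centered on a neutral value, so well-rated
        # documents pull the profile toward them and badly rated ones push it away.
        # Assumption: an unrated view counts as a mild positive signal.
        weight = view_weight if rating is None else rating - neutral_rating
        for topic_id, value in lsi_vector:
            profile[topic_id] += weight * value
        total += abs(weight)
    if total == 0.0:
        return profile  # no usable signal, keep the zero vector
    return [x / total for x in profile]


def to_dense(lsi_vector, num_topics):
    """Convert a sparse (topic_id, value) list into a dense list."""
    dense = [0.0] * num_topics
    for topic_id, value in lsi_vector:
        dense[topic_id] = value
    return dense


def cosine(u, v):
    """Cosine similarity between two dense vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Toy example with 2 topics (made-up numbers, not real LSI output).
history = [
    ([(0, 1.0)], 5),               # liked an article about topic 0
    ([(1, 1.0)], 1),               # disliked an article about topic 1
    ([(0, 1.0), (1, 1.0)], None),  # viewed a mixed article, no rating
]
profile = user_profile(history, num_topics=2)

candidates = {
    "doc_a": to_dense([(0, 1.0)], 2),
    "doc_b": to_dense([(1, 1.0)], 2),
}
# Rank candidate documents by similarity to the user profile.
ranked = sorted(candidates, key=lambda d: cosine(profile, candidates[d]), reverse=True)
```

With this weighting, a user and a document live in the same topic space, so recommending reduces to a nearest-neighbour search around the user vector. Whether centering the ratings like this is a good idea is exactly what I am unsure about.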
Any ideas?
Thanks