
I am a newbie learning Mahout.

I learned that there are five recommenders in Mahout: user-based, item-based,...

The dataset I am using is MovieLens 100K.

I am thinking of implementing a movie recommender that differs slightly from the user-based one: instead of taking a user ID as input and recommending movies to that one user, I want to take user demographic information as input, e.g., age range, gender, occupation, and zip code.

The problem is: how do I create my own user similarity method (the original one takes two user IDs of type long as parameters), and how do I combine the u.user and u.data files?

Taryn
vycon

2 Answers


I understand your question now. I think the simplest thing is to temporarily create a dummy user with the demographic properties you are querying for, and then recommend for that dummy user.

Yes, you would have to write a UserSimilarity that implements whatever similarity rule you want on top of the demographic data.
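As a rough illustration of what such a rule could look like, here is a minimal sketch in plain Java, with no Mahout dependency. It assumes the MovieLens 100K u.user layout (`id|age|gender|occupation|zip`). The class names `DemographicSimilarity` and `Profile`, the ten-year age bucket, and the equal weighting of the four attributes are all assumptions made for the example, not anything Mahout prescribes.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative only: a demographic similarity over MovieLens u.user rows. */
final class DemographicSimilarity {

    /** One row of u.user, assumed to be "id|age|gender|occupation|zip". */
    static final class Profile {
        final long id;
        final int age;
        final String gender;
        final String occupation;
        final String zip;

        Profile(long id, int age, String gender, String occupation, String zip) {
            this.id = id;
            this.age = age;
            this.gender = gender;
            this.occupation = occupation;
            this.zip = zip;
        }
    }

    /** Parses a single pipe-delimited u.user line. */
    static Profile parse(String line) {
        String[] f = line.split("\\|");
        return new Profile(Long.parseLong(f[0]), Integer.parseInt(f[1]), f[2], f[3], f[4]);
    }

    /** Indexes profiles by user ID so they can be looked up from a pair of IDs. */
    static Map<Long, Profile> index(Iterable<String> lines) {
        Map<Long, Profile> byId = new HashMap<>();
        for (String line : lines) {
            Profile p = parse(line);
            byId.put(p.id, p);
        }
        return byId;
    }

    /** Fraction of the four attributes that match, so the result lies in [0, 1]. */
    static double similarity(Profile a, Profile b) {
        int matches = 0;
        if (a.age / 10 == b.age / 10) matches++;          // same decade (assumed bucket)
        if (a.gender.equals(b.gender)) matches++;
        if (a.occupation.equals(b.occupation)) matches++;
        if (a.zip.equals(b.zip)) matches++;
        return matches / 4.0;
    }
}
```

In a real `UserSimilarity` implementation, the body of `userSimilarity(long userID1, long userID2)` would presumably look both IDs up in a map like the one `index` builds and return the value of `similarity`; the dummy user only needs an entry in that map.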

Sean Owen
  • Thanks, Sean! I also got your email. I am still not quite clear. Do you mean I need to create a temp ID for this user? How do I combine these two dataset files (u.user and u.data) to make recommendations? Thanks. – vycon May 28 '11 at 16:58
  • Let's follow up on the user@ list since more discussion is there. – Sean Owen May 28 '11 at 18:38
  • It seems that my subscription has not been approved. – vycon May 29 '11 at 02:17
  • There is no moderation for subscribers. You posted a message and it was received. – Sean Owen May 29 '11 at 07:51

Maybe there is another solution.

I implemented my own Rescorer to handle the u.user file and the input (gender, age range, ...). If every piece of a user's information matches the input, I put the corresponding user ID into a FastIDSet.

Then, in the rescore method, I check whether the current user ID is in the FastIDSet; if so, I augment the score.
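The matching and boosting steps described above can be sketched roughly as follows, in plain Java without Mahout on the classpath. A `java.util.Set<Long>` stands in for `FastIDSet`, and the two instance methods only mirror the shape of Mahout's `IDRescorer` (`rescore` and `isFiltered`). The class name `DemographicBoost`, the multiplicative boost factor, and the choice of matching on age range, gender, and occupation are assumptions for illustration.

```java
import java.util.HashSet;
import java.util.Set;

/** Illustrative only: collect demographically matching IDs, then boost their scores. */
final class DemographicBoost {
    private final Set<Long> matchingIds;
    private final double boost;

    DemographicBoost(Set<Long> matchingIds, double boost) {
        this.matchingIds = matchingIds;
        this.boost = boost;
    }

    /**
     * Scans u.user lines ("id|age|gender|occupation|zip") and returns the IDs
     * whose age falls in [minAge, maxAge] and whose gender and occupation match.
     */
    static Set<Long> matchingUserIds(Iterable<String> lines, int minAge, int maxAge,
                                     String gender, String occupation) {
        Set<Long> ids = new HashSet<>();
        for (String line : lines) {
            String[] f = line.split("\\|");
            int age = Integer.parseInt(f[1]);
            if (age >= minAge && age <= maxAge
                    && f[2].equals(gender) && f[3].equals(occupation)) {
                ids.add(Long.parseLong(f[0]));
            }
        }
        return ids;
    }

    /** Multiplies the score by the boost factor when the ID is in the matching set. */
    double rescore(long id, double originalScore) {
        return matchingIds.contains(id) ? originalScore * boost : originalScore;
    }

    /** Nothing is filtered out in this sketch; non-matching IDs just keep their score. */
    boolean isFiltered(long id) {
        return false;
    }
}
```

One thing worth checking against the Mahout documentation is which IDs the rescorer actually receives in a given call, since a set of user IDs only helps if the IDs passed to `rescore` are user IDs as well.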

In my own Recommender, I use PlusAnonymousUserDataModel to get a temporary ID and call the method recommend(id, howMany, rescorer).

However, after trying different dataset files, I get 0 recommended items.

I am wondering whether this is the right way to use PlusAnonymousUserDataModel.

vycon