I am designing an application that incorporates a recommendation system based on user interactions (collaborative filtering). On the homepage, the user is presented with a set of 6 items to interact with. There will be between 50 and 300 items. The following actions are possible:

  1. click on an item (strong interest)
  2. refresh an item (some interest)
  3. open a read-more dialog (some interest)
  4. don't do anything and move on (no interest)

This data is collected and stored. The system should recommend items of interest to the user. I am thinking about turning this data into a rating system.

Option A) If the user clicks on an item, this is translated into an implicit lifetime rating of 5; refreshing an item is a 4, and so on. So my user->item matrix would look like this:

       item 1 | item 2 | item 3
john   5                 4
jane   4

In this example john has clicked on item 1 and refreshed item 3. The rating can only go up, i.e. if a user has previously refreshed an item I write a 4, and I update it to a 5 only if the item is clicked later.
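The "rating can only go up" rule of Option A could be sketched roughly as follows. This is only an illustration: the `update_rating` helper and the `ACTION_RATING` table are hypothetical names, and the value 3 for the read-more dialog is an assumption (the question only fixes click = 5 and refresh = 4).

```python
# Hypothetical mapping from action to implicit rating.
# Only click = 5 and refresh = 4 come from the question; read_more = 3 is assumed.
ACTION_RATING = {"click": 5, "refresh": 4, "read_more": 3}

def update_rating(ratings, user, item, action):
    """Keep the highest rating implied by any action seen so far."""
    new_rating = ACTION_RATING[action]
    user_row = ratings.setdefault(user, {})
    # max() ensures a later, weaker action never lowers an existing rating.
    user_row[item] = max(user_row.get(item, 0), new_rating)
    return ratings

ratings = {}
update_rating(ratings, "john", "item 3", "refresh")
update_rating(ratings, "john", "item 1", "refresh")
update_rating(ratings, "john", "item 1", "click")    # upgrades 4 to 5
update_rating(ratings, "john", "item 1", "refresh")  # stays at 5
update_rating(ratings, "jane", "item 1", "refresh")
```

After these calls `ratings` holds the same values as the matrix above: john has 5 for item 1 and 4 for item 3, jane has 4 for item 1.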

Option B) each time the user does one of the above actions, I'll increment a scalar value for the item, which means it can grow unbounded.

       item 1 | item 2 | item 3
john   55       1        30
jane   41       9

Maybe this is a problem, since the numbers are now harder to translate into a rating scale from 1 to 10.
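One possible way to map the unbounded counters of Option B onto a 1 to 10 scale is to dampen them with a logarithm and normalise per user. This is just one assumption among many workable choices, not something the question prescribes; the function name `scale_counts` is made up for the sketch.

```python
import math

def scale_counts(counts, low=1.0, high=10.0):
    """Map raw interaction counts to [low, high], separately for each user.

    log1p dampens very large counts; dividing by the user's own maximum
    makes heavy and light users comparable.
    """
    scaled = {}
    for user, row in counts.items():
        top = math.log1p(max(row.values())) if row else 0.0
        scaled[user] = {
            item: (low + (high - low) * math.log1p(count) / top) if top > 0 else low
            for item, count in row.items()
        }
    return scaled

# The counters from the Option B table.
counts = {
    "john": {"item 1": 55, "item 2": 1, "item 3": 30},
    "jane": {"item 1": 41, "item 2": 9},
}
scaled = scale_counts(counts)
```

With this scheme each user's most-used item always lands at 10, and every other item falls somewhere between 1 and 10 depending on how its count compares to that maximum.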

Option C) I count every type of interaction separately:

       item 1 click | item 1 refresh | item 1 read
john   3              1                       
jane   1                               1

Here the problem is that "reading about" an item is probably only done once.

Regardless of which option I choose, my idea is to first find similar users using something like cosine similarity or Pearson correlation. Then I pick the top 10 to 30 users from that list and compile a toplist of their favorite items. From that list, I will then recommend items that the current user has had little interaction with in the past.
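The pipeline described above (similar users, then their favorite items, then filtering out what the current user already knows) might look roughly like this in plain Python. It is a minimal sketch under several assumptions: cosine similarity over sparse rating rows, neighbour votes weighted by similarity, and "little interaction" interpreted as "own rating at most `max_own_rating`". The names `cosine` and `recommend` and the user "jim" are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity of two sparse rating rows (dict: item -> rating)."""
    dot = sum(a[i] * b[i] for i in set(a) & set(b))
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def recommend(ratings, user, k=20, n=6, max_own_rating=0):
    """Recommend up to n items liked by the k most similar users."""
    me = ratings[user]
    # Rank all other users by similarity and keep the top k.
    neighbours = sorted(
        ((cosine(me, row), other) for other, row in ratings.items() if other != user),
        reverse=True,
    )[:k]
    scores = {}
    for sim, other in neighbours:
        if sim <= 0:
            continue  # users with nothing in common contribute nothing
        for item, rating in ratings[other].items():
            # Only consider items the current user has barely interacted with.
            if me.get(item, 0) <= max_own_rating:
                scores[item] = scores.get(item, 0.0) + sim * rating
    return sorted(scores, key=lambda item: (-scores[item], item))[:n]

ratings = {
    "john": {"item 1": 5, "item 3": 4},
    "jane": {"item 1": 4, "item 2": 5},
    "jim": {"item 3": 5, "item 4": 3},
}
print(recommend(ratings, "john"))  # ['item 2', 'item 4']
```

Note that a user who shares no items with anyone gets no neighbours with positive similarity and therefore no recommendations, which is one concrete form of the "no new items" worry.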

Is this something that could work? I am worried that finding similar users will eliminate the chance of finding interesting (new) items for the current user.

reikje

1 Answer

What you suggest sounds reasonable. Your concern about not finding new items reflects a limitation of collaborative filtering, which relies only on interaction data (metadata) rather than on the items themselves. To surface new items you would almost certainly need some content analysis, which would be a separate stage. For example, if your items are news articles, you might try to identify important keywords for each user.
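As a rough illustration of the keyword idea, the sketch below scores the words of the items a user liked with a simple inverse-document-frequency weight. Everything here is an assumption for the sake of the example: the `user_keywords` helper, the whitespace tokenisation, and the toy article texts. A real system would at least need stop-word removal and better tokenisation.

```python
import math
from collections import Counter

def user_keywords(item_texts, liked_items, top=3):
    """Return the words that best characterise the items a user liked."""
    docs = {item: set(text.lower().split()) for item, text in item_texts.items()}
    n_docs = len(docs)
    # Number of items each word appears in.
    doc_freq = Counter(word for words in docs.values() for word in words)
    scores = Counter()
    for item in liked_items:
        for word in item_texts[item].lower().split():
            # Words found in every item get weight log(1) = 0.
            scores[word] += math.log(n_docs / doc_freq[word])
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [word for word, score in ranked if score > 0][:top]

# Toy article texts, invented for this example.
item_texts = {
    "item 1": "election results and election polls",
    "item 2": "football results and transfers",
    "item 3": "election debate tonight",
}
print(user_keywords(item_texts, ["item 1", "item 3"]))
# ['election', 'debate', 'polls']
```

New items whose text contains these keywords could then be recommended even if no similar user has interacted with them yet.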

dan