Binary recommendation algorithms

Question

I'm currently doing some research for a school assignment. I have two data streams: one is user ratings, and the other is the search, click and order history (binary data) of a webshop.

I found that collaborative filtering is the best family of algorithms if you are using rating data. I found and researched these algorithms:

Memory-based

user-based
- Pearson correlation
- constrained Pearson
- vector similarity (cosine)
- mean squared difference
- weighted Pearson
- correlation threshold
- max number of neighbours
- weighted by correlation
- Z-score normalization
item-based
- adjusted cosine
- maximum number of neighbours
similarity fusion

Model-based

regression based
slope one
LSI/SVD
regularized SVD (RSVD/RSVD2/NSVD2/SVD++)
integrated neighbor based
cluster based smoothing

Now I'm looking for a way to use the binary data, but I'm having a hard time figuring out whether these algorithms can work with binary data instead of rating data, or whether there is a different family of algorithms I should be looking at.

I apologize in advance for spelling errors, since I have dyslexia and am not a native writer. Thanks marc_s for helping.

score 3 · Answer 1 · answered Sep 21 '15 at 20:30

Take a look at data mining algorithms such as association rule mining (aka market basket analysis). You've come upon a tough problem in recommendation systems: unary and binary data are common, but the best algorithms for personalization don't work well with them.

Rating data can represent the degree of preference for a single user-item pair; e.g., I rate this movie 4 stars out of 5. Binary data is the least granular type of rating data: I either like or don't like something, or have or have not consumed it.

Be careful not to confuse binary and unary data. Unary data means that you know a user consumed something (coded as 1, much like binary data), but you have no information about whether the user didn't like or didn't consume something (coded as NULL instead of binary data's 0). For instance, you may know that a person viewed 10 web pages, but you have no idea what she would have thought of other pages had she known they were available. That's unary data. You can't assume any preference information from NULL.
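To make the association rule idea concrete, here is a minimal sketch in plain Python. It is not a full Apriori implementation: it only mines rules between pairs of single items, and the baskets, item names, function names (`pair_rules`, `recommend`) and thresholds are all made up for illustration. Each basket is treated as unary data, so an item missing from a basket is read as "unknown", not as a dislike.

```python
from collections import Counter
from itertools import combinations

# Hypothetical order history: one set of item names per order.
# Only purchases are recorded (unary data); absence means "unknown".
baskets = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"milk", "cereal"},
    {"bread", "butter", "jam"},
]


def pair_rules(baskets, min_support=0.4, min_confidence=0.6):
    """Mine rules x -> y between single items.

    support    = share of baskets containing both x and y
    confidence = share of baskets with x that also contain y
    lift       = confidence divided by the overall share of baskets with y
    """
    n = len(baskets)
    item_counts = Counter(item for b in baskets for item in b)
    pair_counts = Counter(
        pair for b in baskets for pair in combinations(sorted(b), 2)
    )
    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue
        # Each frequent pair can yield a rule in either direction.
        for x, y in ((a, b), (b, a)):
            confidence = count / item_counts[x]
            if confidence >= min_confidence:
                lift = confidence / (item_counts[y] / n)
                rules.append((x, y, support, confidence, lift))
    # Strongest rules first; names break ties so the order is stable.
    return sorted(rules, key=lambda r: (-r[3], -r[4], r[0], r[1]))


def recommend(basket, rules):
    """Suggest items not already in the basket, in rule order."""
    seen = set(basket)
    suggestions = []
    for x, y, *_ in rules:
        if x in seen and y not in seen and y not in suggestions:
            suggestions.append(y)
    return suggestions


rules = pair_rules(baskets)
for x, y, support, confidence, lift in rules:
    print(f"{x} -> {y}: support={support:.2f}, "
          f"confidence={confidence:.2f}, lift={lift:.2f}")
print(recommend({"bread"}, rules))
```

On real webshop data you would likely want a library implementation that handles larger itemsets and scales better; the point here is only that support and confidence need nothing more than co-occurrence counts, so they work on 1/NULL data without any ratings.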
