I'm currently doing some research for a school assignment. I have two data streams: one is user ratings, and the other is the search, click, and order history (binary data) of a webshop.
From what I have read, collaborative filtering is the best-suited family of algorithms when you have rating data. I found and researched these algorithms:
Memory-based
user-based
- Pearson correlation
- constrained Pearson
- vector similarity (cosine)
- mean squared difference
- weighted Pearson
- correlation threshold
- max number of neighbours
- weighted by correlation
- Z-score normalization
item-based
- adjusted cosine
- maximum number of neighbours
similarity fusion
Model-based
- regression-based
- Slope One
- LSI/SVD
- regularized SVD (RSVD/RSVD2/NSVD2/SVD++)
- integrated neighbour-based
- cluster-based smoothing
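
To make the first group in the list concrete, here is a minimal sketch of how I understand user-based prediction with Pearson correlation on rating data. The toy `ratings` dictionary, the user and item names, and the helper names `pearson` and `predict` are all made up for illustration, and the mean-centred weighted average is only one common variant, not a reference implementation.

```python
from math import sqrt

# Hypothetical toy data: user -> {item: rating on a 1-5 scale}
ratings = {
    "alice": {"a": 5, "b": 3, "c": 4},
    "bob":   {"a": 4, "b": 2, "c": 3, "d": 5},
    "carol": {"a": 1, "b": 5, "d": 2},
}


def pearson(u, v):
    """Pearson correlation over the items both users rated (0.0 if undefined)."""
    common = set(ratings[u]) & set(ratings[v])
    if len(common) < 2:
        return 0.0
    mean_u = sum(ratings[u][i] for i in common) / len(common)
    mean_v = sum(ratings[v][i] for i in common) / len(common)
    num = sum((ratings[u][i] - mean_u) * (ratings[v][i] - mean_v) for i in common)
    den = sqrt(sum((ratings[u][i] - mean_u) ** 2 for i in common)) * \
          sqrt(sum((ratings[v][i] - mean_v) ** 2 for i in common))
    return num / den if den else 0.0


def predict(user, item, k=2):
    """Predict a rating as the user's mean plus a correlation-weighted
    average of the neighbours' deviations from their own means."""
    mean_user = sum(ratings[user].values()) / len(ratings[user])
    # Only users who actually rated the item can act as neighbours.
    neighbours = [(pearson(user, v), v)
                  for v in ratings if v != user and item in ratings[v]]
    # Keep the k neighbours with the strongest (positive or negative) correlation.
    neighbours = sorted(neighbours, key=lambda t: abs(t[0]), reverse=True)[:k]
    norm = sum(abs(w) for w, _ in neighbours)
    if norm == 0:
        return mean_user  # no usable neighbours: fall back to the user's mean
    offset = 0.0
    for w, v in neighbours:
        mean_v = sum(ratings[v].values()) / len(ratings[v])
        offset += w * (ratings[v][item] - mean_v)
    return mean_user + offset / norm


print(predict("alice", "d"))
```

The parameter `k` corresponds to the "max number of neighbours" entry above. Note that the prediction is not clipped to the rating scale in this sketch.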
Now I'm looking for a way to use the binary data, but I'm having a hard time figuring out whether these algorithms can work with binary data instead of rating data. Or is there a different family of algorithms I should be looking at?
I apologize in advance for spelling errors, since I have dyslexia and am not a native writer. Thanks to marc_s for helping.