I'm trying to find a way, using Apache Mahout, to recommend similar users rather than items.
I have a list of users, each of whom has read certain books. Is there a way to recommend a group of users to another user based on what that user has read?
The recommended users would be ones who have read some of the same books.
Thanks for your help and guidance.
Viewed 787 times
2

paskun
-
Any help or ideas please? – paskun Nov 21 '14 at 13:25
2 Answers
1
Use the spark-rowsimilarity job in Mahout v1. Create a file with rows of the form:
user-ID<tab>book-ID1<space>book-ID2<space>etc...
In other words, each row is one user's history of books read. The first column is the user ID and the second column is a space-delimited list of book IDs. Run "mahout spark-rowsimilarity" and you'll get back files of the form:
user-ID<tab>user-ID5:strength<space>user-ID6:strength<space>etc...
This is a list of similar users for each user. The list is sorted and the strength is the LLR (log-likelihood ratio) score for how similar the users are.
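For intuition about what the strength value measures, here is a minimal, self-contained sketch of a log-likelihood ratio computed from a 2x2 table of "books read" counts for two users. It is an illustration only: the class and method names (`UserLlrSketch`, `userSimilarity`) and the sample data are made up for this example, and the Mahout job may apply additional steps (such as downsampling), so its scores need not match this sketch exactly.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper class, not part of Mahout.
final class UserLlrSketch {

    // x * ln(x), using the convention 0 * ln(0) = 0.
    static double xLogX(long x) {
        return x == 0 ? 0.0 : x * Math.log(x);
    }

    // Unnormalized entropy of a list of counts: N*ln(N) - sum(c*ln(c)).
    static double entropy(long... counts) {
        long sum = 0;
        double result = 0.0;
        for (long c : counts) {
            result += xLogX(c);
            sum += c;
        }
        return xLogX(sum) - result;
    }

    // LLR for a 2x2 contingency table:
    // k11 = books both users read, k12 = only user A, k21 = only user B, k22 = neither.
    static double logLikelihoodRatio(long k11, long k12, long k21, long k22) {
        double rowEntropy = entropy(k11 + k12, k21 + k22);
        double columnEntropy = entropy(k11 + k21, k12 + k22);
        double matrixEntropy = entropy(k11, k12, k21, k22);
        // Clamp tiny negative values caused by floating-point round-off.
        return Math.max(0.0, 2.0 * (rowEntropy + columnEntropy - matrixEntropy));
    }

    // Builds the 2x2 table from two users' book sets and the catalogue size.
    static double userSimilarity(Set<String> booksA, Set<String> booksB, long totalBooks) {
        Set<String> both = new HashSet<>(booksA);
        both.retainAll(booksB);
        long k11 = both.size();
        long k12 = booksA.size() - k11;
        long k21 = booksB.size() - k11;
        long k22 = totalBooks - k11 - k12 - k21;
        return logLikelihoodRatio(k11, k12, k21, k22);
    }

    public static void main(String[] args) {
        // Made-up sample data: three users in a catalogue of 100 books.
        Set<String> alice = new HashSet<>(Arrays.asList("b1", "b2", "b3"));
        Set<String> bob = new HashSet<>(Arrays.asList("b2", "b3", "b4"));
        Set<String> carol = new HashSet<>(Arrays.asList("b9"));
        long totalBooks = 100;
        System.out.println("alice~bob   " + userSimilarity(alice, bob, totalBooks));
        System.out.println("alice~carol " + userSimilarity(alice, carol, totalBooks));
    }
}
```

Users who share more books than chance would predict get a larger score, which is why alice and bob score well above alice and carol here. Note that the score is symmetric in the two users.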
Docs here: http://mahout.apache.org/users/recommender/intro-cooccurrence-spark.html
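If the data starts out in Java, the two file layouts described above can be sketched roughly as follows. This is only an illustration of the text formats as stated in this answer: the class `RowSimilarityFormatSketch`, its methods, and the sample IDs are hypothetical, and the exact delimiters of a real job may depend on the command line options used.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper class, not part of Mahout.
final class RowSimilarityFormatSketch {

    // Formats one input row: user-ID<tab>book-ID1<space>book-ID2<space>...
    static String toInputLine(String userId, List<String> bookIds) {
        return userId + "\t" + String.join(" ", bookIds);
    }

    // Parses one output row: user-ID<tab>user-ID5:strength<space>user-ID6:strength...
    // Returns the similar users in file order, mapped to their strengths.
    static Map<String, Double> parseOutputLine(String line) {
        Map<String, Double> similar = new LinkedHashMap<>();
        String[] cols = line.split("\t", 2);
        if (cols.length < 2 || cols[1].trim().isEmpty()) {
            return similar; // no similar users listed for this row
        }
        for (String token : cols[1].trim().split(" +")) {
            int colon = token.lastIndexOf(':');
            if (colon < 0) {
                continue; // skip tokens that do not look like id:strength
            }
            String otherUser = token.substring(0, colon);
            double strength = Double.parseDouble(token.substring(colon + 1));
            similar.put(otherUser, strength);
        }
        return similar;
    }

    public static void main(String[] args) {
        // Made-up IDs, purely for illustration.
        System.out.println(toInputLine("user-1", Arrays.asList("book-10", "book-11")));
        System.out.println(parseOutputLine("user-1\tuser-5:12.5 user-6:3.25"));
    }
}
```

Writing one `toInputLine` result per user to a text file gives the input described above, and `parseOutputLine` turns each result row back into a map you can use in your own application.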

pferrel
-
-
Sorry, I didn't have time to look at your answer in detail until now. It seemed to answer my question perfectly, but when I looked at the documentation I didn't understand how to use Spark. Could you please give me some hints on how to use it from Java? Thanks a lot – paskun Nov 24 '14 at 17:09
-
The job mentioned above doesn't require programming; it is run from the command line as ```mahout spark-rowsimilarity```. That will give you a list of command line options. If you build Mahout you can run it locally without creating a Spark cluster by specifying ```mahout spark-rowsimilarity -ma local[4]``` or however many cores you want to give Spark. Creating a Spark cluster on top of Hadoop (which Spark uses for its distributed file system) is best treated as a separate question. Some tutorials are here: https://spark.apache.org/documentation.html – pferrel Nov 25 '14 at 17:38
-
I want to use the mahout spark-itemsimilarity command line tool, but I'm unable to find any proper documentation on how to set it up and use it. Please help – Jayant Apr 12 '16 at 11:21
1
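In Java, using Mahout's Taste API, you could do it like this:
// Load your user/book data into a DataModel (construction elided here).
org.apache.mahout.cf.taste.model.DataModel dataModel;
...
// Compare users by the Pearson correlation of their preferences.
UserSimilarity similarity = new PearsonCorrelationSimilarity(dataModel);
// Keep only neighbours whose similarity meets or exceeds 0.75.
UserNeighborhood neighborhood = new ThresholdUserNeighborhood(0.75, similarity, dataModel);
UserBasedRecommender userBasedRecommender = new GenericUserBasedRecommender(dataModel, neighborhood, similarity);
// Ask the recommender for the users most similar to a given user.
long[] mostSimilarUserIDs = userBasedRecommender.mostSimilarUserIDs(...);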

cnmuc