I'm using the Mahout framework to produce recommendations in an implicit-feedback setting, using the well-known MovieLens dataset (ml-100k). I have binarized it by mapping every rating of four or five to 1 and every other rating to 0. The dataset comes with five splits, each divided into a training set and a test set as usual.
For the recommendation step I train a simple GenericBooleanPrefUserBasedRecommender with TanimotoCoefficientSimilarity, as shown in the following code:
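The binarization rule described above can be sketched roughly as follows. This is only a minimal illustration, not the script actually used to produce the files: the class and method names are hypothetical, and it assumes the usual tab-separated ml-100k layout (user, item, rating, timestamp) and keeps that layout in the output.

```java
final class RatingBinarizer {
    // Assumption taken from the description: ratings of 4 or 5 are positive.
    static final int POSITIVE_THRESHOLD = 4;

    /** Maps a 1-5 star rating to 1 (four or five stars) or 0 (anything else). */
    static int binarize(int rating) {
        return rating >= POSITIVE_THRESHOLD ? 1 : 0;
    }

    /**
     * Rewrites one tab-separated line "user, item, rating, timestamp",
     * replacing only the rating field with its binarized value.
     */
    static String binarizeLine(String line) {
        String[] fields = line.trim().split("\t");
        fields[2] = String.valueOf(binarize(Integer.parseInt(fields[2])));
        return String.join("\t", fields);
    }

    public static void main(String[] args) {
        // Made-up sample lines, not taken from the real dataset.
        String[] sample = {"1\t10\t5\t0", "1\t11\t3\t0", "2\t10\t4\t0"};
        for (String line : sample) {
            System.out.println(binarizeLine(line));
        }
    }
}
```

Applied to each of u1.base, u1.test and so on, this would yield files in which the third column only contains 0 or 1.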
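import java.io.File;
import org.apache.mahout.cf.taste.impl.common.LongPrimitiveIterator;
import org.apache.mahout.cf.taste.impl.model.file.FileDataModel;
import org.apache.mahout.cf.taste.impl.neighborhood.NearestNUserNeighborhood;
import org.apache.mahout.cf.taste.impl.recommender.GenericBooleanPrefUserBasedRecommender;
import org.apache.mahout.cf.taste.impl.similarity.TanimotoCoefficientSimilarity;
import org.apache.mahout.cf.taste.model.DataModel;
import org.apache.mahout.cf.taste.neighborhood.UserNeighborhood;
import org.apache.mahout.cf.taste.similarity.UserSimilarity;

// load the training and test sets of the first split
DataModel trainModel = new FileDataModel(new File(Main.class.getResource("/binarized/u1.base").getFile()));
DataModel testModel = new FileDataModel(new File(Main.class.getResource("/binarized/u1.test").getFile()));
// user-based recommender: Tanimoto similarity, 35 nearest neighbours
UserSimilarity similarity = new TanimotoCoefficientSimilarity(trainModel);
UserNeighborhood neighborhood = new NearestNUserNeighborhood(35, similarity, trainModel);
GenericBooleanPrefUserBasedRecommender userBased = new GenericBooleanPrefUserBasedRecommender(trainModel, neighborhood, similarity);
long firstUser = testModel.getUserIDs().nextLong(); // get the first user in the test set
// estimate a preference for each test item of the first user
for (LongPrimitiveIterator iterItem = testModel.getItemIDsFromUser(firstUser).iterator(); iterItem.hasNext(); ) {
    long currItem = iterItem.nextLong();
    // estimated preference of the first user for the current item
    System.out.println("Estimated preference for item " + currItem + " is " + userBased.estimatePreference(firstUser, currItem));
}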
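For reference, the Tanimoto (Jaccard) coefficient between two users is the size of the intersection of their item sets divided by the size of the union. The following is a minimal plain-Java sketch of that formula only. It does not use Mahout, and the class and method names are hypothetical.

```java
import java.util.HashSet;
import java.util.Set;

final class TanimotoSketch {
    /**
     * Tanimoto coefficient |A ∩ B| / |A ∪ B| of two item-ID sets.
     * Returns NaN when both sets are empty, since the ratio is undefined.
     */
    static double tanimoto(Set<Long> a, Set<Long> b) {
        if (a.isEmpty() && b.isEmpty()) {
            return Double.NaN;
        }
        Set<Long> intersection = new HashSet<>(a);
        intersection.retainAll(b);
        // |A ∪ B| = |A| + |B| - |A ∩ B|
        int unionSize = a.size() + b.size() - intersection.size();
        return (double) intersection.size() / unionSize;
    }

    public static void main(String[] args) {
        Set<Long> userA = Set.of(1L, 2L, 3L);
        Set<Long> userB = Set.of(2L, 3L, 4L);
        // Two shared items out of four distinct items.
        System.out.println(tanimoto(userA, userB));
    }
}
```

Note that the formula only looks at which items are present in each set, not at any value attached to them.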
When I run this code, every estimate is either 0.0 or 1.0, which is not useful for top-n recommendation with implicit feedback. What I need, for each item, is an estimated score in the range [0, 1], so that I can sort the items in decreasing order of score and build the top-n list.
So what is wrong with this code? Have I missed something, or done something incorrectly? Or does Mahout simply not provide a proper way of working with binary feedback?
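To make the goal concrete, this is roughly the ranking step I have in mind once real-valued scores are available. It is a minimal sketch in plain Java with hypothetical names (TopNSketch, topN), and it assumes the scores have already been collected in a map from item ID to estimated score. Entries whose score is NaN are dropped, on the assumption that NaN means no estimate could be made.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

final class TopNSketch {
    /** Returns the IDs of the n highest-scoring items, best first. */
    static List<Long> topN(Map<Long, Double> scores, int n) {
        return scores.entrySet().stream()
                // skip items for which no usable score is available
                .filter(e -> !e.getValue().isNaN())
                // sort by score in decreasing order
                .sorted(Map.Entry.<Long, Double>comparingByValue().reversed())
                .limit(n)
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Made-up scores, only to show the ordering.
        Map<Long, Double> scores = Map.of(10L, 0.2, 20L, 0.9, 30L, 0.5);
        System.out.println(topN(scores, 2));
    }
}
```

With scores that are all exactly 0.0 or 1.0, this ordering is dominated by ties, which is why the current output cannot be ranked meaningfully.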
Thank you in advance,
Alessandro Suglia