I want to use Eclipse to develop my project with mahout-0.9 and hadoop-2.2.0.

My code runs successfully with mahout-0.9, but how can I run my project in Hadoop mode? I think I have to install Hadoop on my computer and start it from the command line; then I could run my project in Eclipse in Hadoop mode.

On Linux, Mahout uses MAHOUT_LOCAL to choose between local mode and Hadoop mode. But when I set the environment variable MAHOUT_LOCAL to "", it still uses local mode. Why?

If it is impossible to run Mahout with Hadoop in Eclipse, how can I run my project? Thanks :)

My sample code

package com.predictionmarketing.itemrecommend;

import java.io.File;
import java.io.IOException;
import java.util.List;

import org.apache.mahout.cf.taste.common.TasteException;
import org.apache.mahout.cf.taste.impl.model.file.FileDataModel;
import org.apache.mahout.cf.taste.impl.recommender.GenericItemBasedRecommender;
import org.apache.mahout.cf.taste.impl.similarity.PearsonCorrelationSimilarity;
import org.apache.mahout.cf.taste.impl.similarity.UncenteredCosineSimilarity;
import org.apache.mahout.cf.taste.model.DataModel;
import org.apache.mahout.cf.taste.recommender.RecommendedItem;
import org.apache.mahout.cf.taste.recommender.Recommender;
import org.apache.mahout.cf.taste.similarity.ItemSimilarity;

public class ItemRecommend {

    public static void main(String[] args) {
        try {
            // Load user,item,preference triples from a local file (not HDFS).
            DataModel model = new FileDataModel(new File("data/test.txt"));
            ItemSimilarity similarity = new UncenteredCosineSimilarity(model);
            Recommender recommender = new GenericItemBasedRecommender(model, similarity);

            // Ask for up to 10 recommendations for user ID 2.
            List<RecommendedItem> recommendations = recommender.recommend(2, 10);
            for (RecommendedItem recommendation : recommendations) {
                System.out.println(recommendation.getItemID() + "," + recommendation.getValue());
            }
        } catch (IOException e) {
            System.out.println("There was an error.");
            e.printStackTrace();
        } catch (TasteException e) {
            System.out.println("There was a Taste Exception");
            e.printStackTrace();
        }
    }
}

LoveTW
  • Your code here doesn't involve any Hadoop class calls. For example, 'new File("data/test.txt")' reads from the local file system, not HDFS. – eliasah Jun 30 '14 at 07:31
  • How can I read from HDFS? Thanks – LoveTW Jun 30 '14 at 07:41
  • The answer here may be a little long, so I advise you to take a look at [this](https://github.com/fredang/mahout-naive-bayes-example/blob/master/src/main/java/com/chimpler/example/bayes/Classifier.java). – eliasah Jun 30 '14 at 08:34

1 Answer

Your example is not Hadoop code. The Mahout recommenders come in non-Hadoop "in-memory" versions, which is what your example uses, and Hadoop versions. The Hadoop version has a very different API because it calculates all recommendations for all users and writes them to HDFS files. You can run the Hadoop version from the command line on a machine that is a Hadoop client (one that knows how to communicate with the Hadoop cluster). Type mahout recommenditembased and it will print a help screen.

Once you have run the Hadoop job on the cluster, you will need to write code to look up the recs for a specific user in those files.

This is often done by writing code that stores the recommendations in a database and using queries to retrieve the recs at runtime.
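
As a rough sketch of that lookup step, the code below loads recommendations into an in-memory map keyed by user ID, standing in for a database table. It assumes the job output has first been exported as plain-text lines of the form userID, a tab, then [itemID:score,itemID:score,...]; that layout is an assumption, so check your own output before relying on it. The class name RecLookup, its methods, and the sample lines are all hypothetical, made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class RecLookup {

    /** One recommended item together with its score. */
    public static final class Rec {
        public final long itemId;
        public final float score;

        Rec(long itemId, float score) {
            this.itemId = itemId;
            this.score = score;
        }
    }

    /**
     * Parses lines of the ASSUMED form "userID<TAB>[itemID:score,itemID:score,...]".
     * Blank lines and lines without a tab are skipped.
     */
    public static Map<Long, List<Rec>> parse(List<String> lines) {
        Map<Long, List<Rec>> byUser = new HashMap<>();
        for (String line : lines) {
            String[] keyValue = line.trim().split("\t", 2);
            if (keyValue.length < 2) {
                continue;
            }
            long userId = Long.parseLong(keyValue[0].trim());
            String body = keyValue[1].trim();
            // Strip the surrounding brackets if they are present.
            if (body.startsWith("[") && body.endsWith("]")) {
                body = body.substring(1, body.length() - 1);
            }
            List<Rec> recs = new ArrayList<>();
            if (!body.isEmpty()) {
                for (String pair : body.split(",")) {
                    String[] parts = pair.split(":", 2);
                    if (parts.length == 2) {
                        recs.add(new Rec(Long.parseLong(parts[0].trim()),
                                         Float.parseFloat(parts[1].trim())));
                    }
                }
            }
            byUser.put(userId, recs);
        }
        return byUser;
    }

    /** Returns the stored recommendations for a user, or an empty list if none exist. */
    public static List<Rec> recommendationsFor(Map<Long, List<Rec>> byUser, long userId) {
        return byUser.getOrDefault(userId, Collections.<Rec>emptyList());
    }

    public static void main(String[] args) {
        // Made-up sample lines in the assumed format.
        List<String> sample = Arrays.asList("2\t[101:4.5,103:3.0]", "3\t[102:2.5]");
        Map<Long, List<Rec>> byUser = parse(sample);
        for (Rec rec : recommendationsFor(byUser, 2L)) {
            System.out.println(rec.itemId + "," + rec.score);
        }
    }
}
```

In a real deployment the map would be replaced by inserts into whatever database you use, with recommendationsFor becoming a query by user ID.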

pferrel
  • Thank you for your suggestion! It is very helpful! Could you tell me how to get the files stored in HDFS? I think the result may be a text file stored on my computer; how can I retrieve it? Thanks! – LoveTW Jul 01 '14 at 02:10
  • They will be in HDFS SequenceFiles. The "key" for each row will be the Mahout user ID (an integer); the value will be RecommendedItems, as I recall, so the value is a list of recommended items with weights. Run `mahout seqdumper -i one-of-the-part-files | more` to get a glimpse of the data and to see the class names of the Key and Value. – pferrel Jul 01 '14 at 20:53