I am using mahout-distribution-0.9 and have a problem with my program.

import java.io.File;
import java.util.List;

import org.apache.mahout.cf.taste.model.DataModel;
import org.apache.mahout.cf.taste.similarity.UserSimilarity;
import org.apache.mahout.cf.taste.impl.similarity.PearsonCorrelationSimilarity;
import org.apache.mahout.cf.taste.neighborhood.UserNeighborhood;
import org.apache.mahout.cf.taste.impl.model.file.FileDataModel;
import org.apache.mahout.cf.taste.impl.neighborhood.NearestNUserNeighborhood;
import org.apache.mahout.cf.taste.recommender.RecommendedItem;
import org.apache.mahout.cf.taste.recommender.Recommender;
import org.apache.mahout.cf.taste.impl.recommender.GenericUserBasedRecommender;


class RecommenderIntro {
    public static void main(String[] args) throws Exception {
        // Alternative input that does produce output:
        // new FileDataModel(new File("F:\\ml-10M100K\\intro.csv"));
        DataModel model =
            new FileDataModel(new File("F:\\ml-10M100K\\ratingsShort.dat"), "::");

        // User-based recommender: Pearson similarity over the 2 nearest neighbors.
        UserSimilarity similarity = new PearsonCorrelationSimilarity(model);
        UserNeighborhood neighborhood = new NearestNUserNeighborhood(2, similarity, model);
        Recommender recommender = new GenericUserBasedRecommender(model, neighborhood, similarity);

        // Ask for 2 recommendations for user 1.
        List<RecommendedItem> recommendations = recommender.recommend(1, 2);
        for (RecommendedItem recommendation : recommendations) {
            System.out.println(recommendation);
        }
    }
}

The content of intro.csv looks like this:

1,101,5.0
1,102,3.0
1,103,2.5
2,101,2.0
2,102,2.5
2,103,5.0

When I run this with intro.csv, I get the following output in Eclipse:

RecommendedItem[item:104, value:4.257081]
RecommendedItem[item:106, value:4.0]

The content of ratingsShort.dat looks like this:

1::122::5::838985046
1::185::5::838983525
1::231::5::838983392
1::292::5::838983421
2::733::3::868244562
2::736::3::868244698

Alternatively, I changed the content of ratingsShort.dat to:

1,539,5
1,589,5
2,110,5
2,151,3
2,733,3
2,802,2
2,1210,4
2,1544,3
3,1246,4
3,1408,3.5
3,1552,2
3,1564,4.5

When I use ratingsShort.dat, there is no output in Eclipse.

FileDataModel(File dataFile, String delimiterRegex)

Mahout supports this constructor, so why is there no output?

Can anybody give me some advice? Thanks a lot!

Samt

2 Answers

OK, I figured out my problem. I switched my MovieLens dataset from ml-10m.zip to ml-1m.zip, and now it does produce output.

So the issue was that the subset I had extracted was not suitable. The intro.csv from the Internet is sufficient for Mahout to calculate recommendation values, but the dataset I had cut at will was not.
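A likely reason the hand-cut subset is not suitable (my reading, not something verified against the Mahout source here): in the comma-separated sample from the question, no two users have rated the same item, so a user-based recommender has nothing to compare users on and user 1 ends up with no neighbors. The sketch below is plain Java with no Mahout dependency; the class and method names (OverlapCheck, itemsByUser, coRatedCount) are made up for illustration. It only counts co-rated items, which is a quick sanity check to run on a subset before feeding it to the recommender.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of Mahout: counts items rated by both users.
class OverlapCheck {

    // Groups item IDs by user ID; each row is {userID, itemID}.
    static Map<Long, Set<Long>> itemsByUser(long[][] ratings) {
        Map<Long, Set<Long>> result = new HashMap<>();
        for (long[] row : ratings) {
            result.computeIfAbsent(row[0], k -> new HashSet<>()).add(row[1]);
        }
        return result;
    }

    // Returns how many items both users have rated.
    static int coRatedCount(Map<Long, Set<Long>> items, long userA, long userB) {
        Set<Long> common = new HashSet<>(items.getOrDefault(userA, new HashSet<>()));
        common.retainAll(items.getOrDefault(userB, new HashSet<>()));
        return common.size();
    }

    public static void main(String[] args) {
        // User/item pairs copied from the comma-separated ratingsShort.dat sample.
        long[][] sample = {
            {1, 539}, {1, 589},
            {2, 110}, {2, 151}, {2, 733}, {2, 802}, {2, 1210}, {2, 1544},
            {3, 1246}, {3, 1408}, {3, 1552}, {3, 1564}
        };
        Map<Long, Set<Long>> items = itemsByUser(sample);
        System.out.println("users 1 and 2 share " + coRatedCount(items, 1, 2) + " items");
        System.out.println("users 1 and 3 share " + coRatedCount(items, 1, 3) + " items");
    }
}
```

Both counts come out as zero for this sample. A larger slice such as ml-1m is much more likely to contain users with items in common.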

Samt

You need to translate your IDs into Mahout IDs. Mahout treats user and item IDs as the row and column numbers of the rating matrix. So the first row/user ID will be "0", which corresponds to your ID of "1"; the same goes for column/item IDs. If your IDs were only the ones shown above, they would need to be translated to Mahout IDs as below:

0,2,5
0,3,5
1,0,5
1,1,3
1,4,3
1,5,2
1,6,4
1,9,3
2,7,4
2,8,3.5
2,10,2
2,11,4.5

It doesn't matter how you map row/user and column/item IDs to Mahout IDs (I did it above by sort order, but this is not required), as long as the Mahout IDs are contiguous non-negative integers. When you get recommendations, they must then be translated back into your original IDs.
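The translation in both directions can be sketched as a small two-way lookup. This is plain Java with no Mahout dependency; IdIndex, indexOf and originalOf are hypothetical names, and the sketch hands out indexes in first-seen order rather than sort order, which is fine since the order does not matter. Use one instance for users and a separate one for items.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: maps arbitrary IDs to contiguous indexes 0, 1, 2, ... and back.
class IdIndex {
    private final Map<Long, Integer> toIndex = new HashMap<>();
    private final List<Long> toOriginal = new ArrayList<>();

    // Returns the index for an original ID, assigning the next free one on first sight.
    int indexOf(long originalId) {
        Integer existing = toIndex.get(originalId);
        if (existing != null) {
            return existing;
        }
        int next = toOriginal.size();
        toIndex.put(originalId, next);
        toOriginal.add(originalId);
        return next;
    }

    // Translates an index back into the original ID.
    long originalOf(int index) {
        return toOriginal.get(index);
    }

    public static void main(String[] args) {
        IdIndex users = new IdIndex();
        IdIndex items = new IdIndex();
        // A few rows from the sample above: {userID, itemID}.
        long[][] rows = { {1, 539}, {1, 589}, {2, 110}, {2, 151} };
        for (long[] row : rows) {
            System.out.println(users.indexOf(row[0]) + "," + items.indexOf(row[1]));
        }
        // Translating a recommended item index back to the original item ID.
        System.out.println("item index 2 is original item " + items.originalOf(2));
    }
}
```

Keep the same IdIndex instances around after building the data file, since they are what turns the recommended indexes back into your own IDs.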

pferrel