I read the evaluator quickstart at http://lenskit.org/documentation/evaluator/quickstart/. I first tried to run it with the `lenskit eval` script: I created a new Groovy file in my hello-lenskit example and ran it from the command line, but nothing happened. Then I tried to use it from a Java program (hello-lenskit.java) and ran into some errors.

    File dataFile = new File("ml-100k/u.data");
    PreferenceDomain domain = new PreferenceDomain(1.0,5.0,1.0);
    // Error on the next line: CSVDataSource is not public and cannot be accessed from outside its package.
    DataSource data = new CSVDataSource("ml-100k", dataFile, "\t", domain);
    CrossfoldTask cross = new CrossfoldTask();

    LenskitConfiguration config1 = new LenskitConfiguration();
    config1.bind(ItemScorer.class)
            .to(UserMeanItemScorer.class);
    AlgorithmInstance alg1 = new AlgorithmInstance("PersMean",config1);
    evl.addAlgorithm(alg1);

    LenskitConfiguration config2 = new LenskitConfiguration();
    config2.bind(ItemScorer.class)
            .to(ItemItemScorer.class);
    config2.bind(UserVectorNormalizer.class)
            .to(BaselineSubtractingUserVectorNormalizer.class);
    config2.within(UserVectorNormalizer.class)
            .bind(BaselineScorer.class,ItemScorer.class)
            .to(ItemMeanRatingItemScorer.class);
    AlgorithmInstance alg2 = new AlgorithmInstance("ItemItem",config2);
    evl.addAlgorithm(alg2);

    evl.addMetric(RMSEPredictMetric.class);
    File file = new File("eval-results.csv");
    evl.setOutput(file);

What should I do next? How could I generate the overall rating error?

Pyrology
user3369592
  • _'nothing happened'_ and _'run into some errors'_ are not an adequate problem description. –  Mar 13 '15 at 01:55

1 Answer

Using the LensKit evaluation commands manually is difficult, undocumented, and not recommended.

The SimpleEvaluator is the best way to get overall accuracy from a LensKit recommender in a Java application.
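For reference, the question configures `RMSEPredictMetric`, so the "overall rating error" here is presumably the root mean squared error (RMSE) between predicted and held-out ratings. The sketch below shows only that arithmetic in plain Java. It is not LensKit's API, and `RmseSketch` and its `rmse` method are hypothetical names introduced for illustration.

```java
// Illustrative only: plain Java, not LensKit's API. RmseSketch is a hypothetical name.
final class RmseSketch {
    /** Root mean squared error over paired predicted and actual ratings. */
    static double rmse(double[] predicted, double[] actual) {
        if (predicted.length != actual.length || predicted.length == 0) {
            throw new IllegalArgumentException("arrays must be non-empty and of equal length");
        }
        double sumSquaredError = 0.0;
        for (int i = 0; i < predicted.length; i++) {
            double error = predicted[i] - actual[i];
            sumSquaredError += error * error;
        }
        return Math.sqrt(sumSquaredError / predicted.length);
    }

    public static void main(String[] args) {
        // Made-up predictions and held-out ratings, purely for demonstration.
        double[] predicted = {3.5, 4.0, 2.0};
        double[] actual = {4.0, 4.0, 1.0};
        System.out.println(rmse(predicted, actual));
    }
}
```

An evaluator would normally compute this over the test partition of each crossfold split rather than over hand-written arrays.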

For further assistance in debugging LensKit runs, I recommend e-mailing the mailing list with exactly the commands you are running and the output or errors you are getting.

Michael Ekstrand
  • I want to try it with my own dataset now, but my data format differs from the MovieLens 100K example. Each line of my dataset has three values, UserID, BookISBN, and Rating (no timestamp), separated by ";". For example, one line is: 234456; ISBN123;8. What changes do I have to make for LensKit to run on my own dataset? This is what I have right now: `EventDAO dao = new SimpleFileRatingDAO(inputFile, ";");` By the way, my input file is a CSV file. – user3369592 Mar 16 '15 at 21:19
  • Also, some users in my dataset have fewer than 20 ratings. – user3369592 Mar 16 '15 at 21:26
  • @user3369592 The ';' is easy to handle; just change your delimiter to ';' (the simple evaluator supports configuring delimiters). If your ISBNs are pure numbers (no -'s or other formatting), no other changes are needed. If they have non-numeric characters, you will need to pre-process the data file to only have numeric item IDs. – Michael Ekstrand Mar 17 '15 at 15:49
  • @Michael Ekstrand What exactly is the preference domain, as in `PreferenceDomain domain = new PreferenceDomain(1.0,5.0,1.0);`? When I changed 5.0 to 20.0, I got a different result. – user3369592 Mar 21 '15 at 02:07
  • @user3369592 The preference domain is the valid range of ratings. `[1.0,5.0]/1.0` means whole-star ratings from 1 to 5 stars. – Michael Ekstrand Mar 30 '15 at 20:31
  • @Michael Ekstrand Thank you for your reply. Are the algorithms used in the hello-lenskit example (PersMean and FunkSVD) item-based or user-based CF? What similarity measures do those two algorithms use? – user3369592 Mar 30 '15 at 22:55
  • @user3369592 The `PersMean` algorithm is a personalized mean (user + item bias), and `FunkSVD` is Simon Funk's SVD collaborative filter. Neither is item- or user-based CF. – Michael Ekstrand Apr 02 '15 at 19:28
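The comments above note that item IDs must be numeric, so ISBNs with non-numeric characters need pre-processing. One possible way to do that is sketched below: each distinct ISBN string is assigned a numeric ID in first-seen order, and every line is rewritten with that ID. `IsbnRemapper`, `idFor`, and `remap` are hypothetical names, not part of LensKit, and the three-field `user;isbn;rating` layout is assumed from the example line in the comments.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

// Illustrative only: IsbnRemapper is a hypothetical helper, not part of LensKit.
final class IsbnRemapper {
    // Maps each distinct ISBN string to a numeric item ID, in first-seen order.
    private final Map<String, Long> itemIds = new LinkedHashMap<>();

    /** Returns the numeric ID for an ISBN, assigning the next free ID (starting at 1) if unseen. */
    long idFor(String isbn) {
        Long existing = itemIds.get(isbn);
        if (existing != null) {
            return existing;
        }
        long next = itemIds.size() + 1L;
        itemIds.put(isbn, next);
        return next;
    }

    /** Rewrites "user<delim>isbn<delim>rating" lines so that the item field is numeric. */
    List<String> remap(List<String> lines, String delimiter) {
        List<String> out = new ArrayList<>();
        for (String line : lines) {
            if (line.trim().isEmpty()) {
                continue; // skip blank lines
            }
            // Quote the delimiter so it is matched literally rather than as a regex.
            String[] fields = line.split(Pattern.quote(delimiter));
            if (fields.length != 3) {
                throw new IllegalArgumentException("expected 3 fields: " + line);
            }
            out.add(fields[0].trim() + delimiter + idFor(fields[1].trim())
                    + delimiter + fields[2].trim());
        }
        return out;
    }

    public static void main(String[] args) {
        IsbnRemapper remapper = new IsbnRemapper();
        List<String> input = Arrays.asList(
                "234456; ISBN123;8", "234457; ISBN999;6", "234458; ISBN123;9");
        for (String line : remapper.remap(input, ";")) {
            System.out.println(line);
        }
    }
}
```

Keeping the ISBN-to-ID map around (or writing it to a side file) makes it possible to translate recommended item IDs back to ISBNs afterwards.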
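To make the "personalized mean (user + item bias)" description in the last comment concrete, here is a minimal sketch of such a bias model: a prediction is the global mean plus an item offset plus a user offset, with no similarity measure involved. This is a simplified, undamped version written in plain Java under my own assumptions. It is not LensKit's implementation, and `PersonalizedMeanSketch` with its `fit` and `predict` methods are hypothetical names.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative only: an undamped bias model in plain Java, not LensKit's implementation.
final class PersonalizedMeanSketch {
    private double globalMean;
    private final Map<Long, Double> itemBias = new HashMap<>();
    private final Map<Long, Double> userBias = new HashMap<>();

    /** Fits the model; index i of each array describes one rating (user, item, value). */
    void fit(long[] users, long[] items, double[] values) {
        int n = values.length;
        if (n == 0 || users.length != n || items.length != n) {
            throw new IllegalArgumentException("arrays must be non-empty and of equal length");
        }
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        globalMean = sum / n;

        // Item bias: average of (rating - global mean) over each item's ratings.
        Map<Long, double[]> itemAcc = new HashMap<>(); // value = {sum, count}
        for (int i = 0; i < n; i++) {
            double[] acc = itemAcc.computeIfAbsent(items[i], k -> new double[2]);
            acc[0] += values[i] - globalMean;
            acc[1] += 1.0;
        }
        itemAcc.forEach((item, acc) -> itemBias.put(item, acc[0] / acc[1]));

        // User bias: average of what remains after removing global mean and item bias.
        Map<Long, double[]> userAcc = new HashMap<>(); // value = {sum, count}
        for (int i = 0; i < n; i++) {
            double[] acc = userAcc.computeIfAbsent(users[i], k -> new double[2]);
            acc[0] += values[i] - globalMean - itemBias.get(items[i]);
            acc[1] += 1.0;
        }
        userAcc.forEach((user, acc) -> userBias.put(user, acc[0] / acc[1]));
    }

    /** Global mean plus item and user offsets; unseen users or items contribute 0. */
    double predict(long user, long item) {
        return globalMean
                + itemBias.getOrDefault(item, 0.0)
                + userBias.getOrDefault(user, 0.0);
    }

    public static void main(String[] args) {
        PersonalizedMeanSketch model = new PersonalizedMeanSketch();
        // Made-up ratings, purely for demonstration.
        model.fit(new long[]{1, 1, 2}, new long[]{10, 20, 10}, new double[]{4.0, 2.0, 2.0});
        System.out.println(model.predict(1, 10));
    }
}
```

Production implementations commonly add damping terms so that users or items with very few ratings are pulled toward the global mean; that is omitted here for brevity.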