My Gradle crossfold task looks like this:

task crossfold(type: Crossfold, group: 'evaluate') {

    input 'data/mt-500k.yml'
    // hold out 1/5 of each user's ratings for testing, ordered by timestamp
    holdoutFraction(0.2, 'timestamp')
    // use 3-fold cross-validation
    partitionCount 3
    // use the partition-users method
    method 'partition-users'
}

The mt-500k dataset contains all the ratings. Because I have limited RAM, I need to run my algorithms in separate evaluation runs. Even though my data doesn't change, the crossfold is regenerated on each run, which puts different users in the training and test folds and makes the results incomparable. How can I keep the same crossfold, or prevent LensKit from regenerating it?

Diederik

1 Answer

It turns out to be as simple as changing the dataSet setting in the evaluate task to:

dataSet 'build/crossfold.out/datasets.yaml'

LensKit takes care of everything else.
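
For context, here is a minimal sketch of what the evaluate task might look like after this change. The dataSet line is the one given above; the algorithm name, its config path, and the output file are hypothetical placeholders, and the sketch assumes the crossfold task has already been run once so that the manifest exists on disk.

```groovy
import org.lenskit.gradle.*

// Sketch only: assumes `gradle crossfold` has already been run once,
// so build/crossfold.out/datasets.yaml exists on disk.
task evaluate(type: TrainTest, group: 'evaluate') {
    // read the existing crossfold manifest instead of referencing the crossfold task
    dataSet 'build/crossfold.out/datasets.yaml'

    // hypothetical algorithm name and config file; swap in one algorithm per run
    algorithm 'ItemItem', 'algorithms/item-item.groovy'

    // hypothetical output path; use a different file per run to keep results apart
    outputFile 'build/eval-item-item.csv'

    predict {
        metric 'rmse'
    }
}
```

Since every run reads the same manifest, each algorithm should be evaluated on the same training/test partitions, which is what makes the separate runs comparable.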

Diederik