WEKA Cross Validation:

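 import java.util.Random;
 import weka.classifiers.Classifier;
 import weka.classifiers.Evaluation;
 import weka.classifiers.trees.J48;

 // data is a weka.core.Instances object with its class index already set
 Classifier cls = new J48();
 Evaluation eval = new Evaluation(data);
 Random rand = new Random(1);  // using seed = 1
 int folds = 10;
 eval.crossValidateModel(cls, data, folds, rand);
 System.out.println(eval.toSummaryString());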

What does "rand" mean? How does cross validation work in this case? Are the 10 folds always mixed?

Thank you!

vubo

1 Answer

What does "rand" mean?

rand is an instance of java.util.Random that is used to shuffle the dataset for cross validation. The seed determines the pseudo-random sequence, so the same seed reproduces the same shuffle on every run.

How does cross validation work in this case?

The dataset is shuffled first. For example, if you had data rows 1-100 in order, after shuffling the first five might be (77, 12, 4, 7, 55) instead of (1, 2, 3, 4, 5).
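To make the role of the seed concrete, here is a minimal sketch in plain Java. It is not WEKA code: the class name SeededShuffleDemo and the method shuffledRows are made up for illustration, and it shuffles a list of row numbers with java.util.Collections.shuffle rather than a WEKA dataset.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical demo class, not part of WEKA.
class SeededShuffleDemo {
    // Returns row numbers 1..n in an order determined entirely by the seed.
    static List<Integer> shuffledRows(int n, long seed) {
        List<Integer> rows = new ArrayList<>();
        for (int i = 1; i <= n; i++) {
            rows.add(i);
        }
        Collections.shuffle(rows, new Random(seed));
        return rows;
    }

    public static void main(String[] args) {
        System.out.println(shuffledRows(10, 1)); // some shuffled order
        System.out.println(shuffledRows(10, 1)); // same seed, same order again
        System.out.println(shuffledRows(10, 0)); // seed 0 is just another seed
    }
}
```

The same seed always reproduces the same order, which is what makes a cross validation run repeatable. A seed of 0 does not switch the shuffling off; it only selects a different pseudo-random sequence.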

Are the 10 folds always mixed?

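It depends on the tools or libraries you use. With the WEKA call above, yes: Evaluation.crossValidateModel works on a copy of the data, shuffles it with the Random you pass in, and, if the class attribute is nominal, stratifies it before cutting it into folds. Because the seed is fixed, you get the same folds on every run. Without shuffling, the folds would simply be rows 1-10, then 11-20, and so on, which causes bias when rows that sit together in a file have similar characteristics. That is why data is best randomized.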
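As an illustration of shuffling before splitting, here is a minimal sketch in plain Java. It is not WEKA's implementation: the class name FoldSplitDemo, the method testFolds, and the choice to give the first folds one extra row are all assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical demo class, not part of WEKA.
class FoldSplitDemo {
    // Shuffles row indices 0..n-1 with the given seed, then cuts the
    // shuffled list into k contiguous test folds of nearly equal size.
    static List<List<Integer>> testFolds(int n, int k, long seed) {
        List<Integer> idx = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            idx.add(i);
        }
        Collections.shuffle(idx, new Random(seed));

        List<List<Integer>> folds = new ArrayList<>();
        int start = 0;
        for (int f = 0; f < k; f++) {
            // Assumption for this sketch: the first (n % k) folds take one extra row.
            int size = n / k + (f < n % k ? 1 : 0);
            folds.add(new ArrayList<>(idx.subList(start, start + size)));
            start += size;
        }
        return folds;
    }

    public static void main(String[] args) {
        List<List<Integer>> folds = testFolds(25, 10, 1);
        for (int f = 0; f < folds.size(); f++) {
            // Each fold is the test set once; all other rows are the training set.
            System.out.println("fold " + f + " test rows: " + folds.get(f));
        }
    }
}
```

Every row lands in exactly one test fold, and because the indices were shuffled first, rows that were neighbours in the file are spread across folds instead of ending up in the same one.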

applecrusher
  • Thank you for this helpful answer. How can I find out how WEKA implements cross validation? (I'm using WEKA 3.8.) And if I set Random(0), does that mean I have eliminated the randomness? – F 505 Mar 20 '18 at 07:47