After running a 10-fold cross-validation with a classifier, how can I print the predicted class of every instance and the class distribution for each of these instances?

J48 j48 = new J48();
Evaluation eval = new Evaluation(newData);
eval.crossValidateModel(j48, newData, 10, new Random(1));

When I tried something similar to below, it said that the classifier is not built.

for (int i = 0; i < newData.numInstances(); i++) {
    System.out.println(j48.distributionForInstance(newData.instance(i)));
}

What I'm trying to do is reproduce a function of the WEKA GUI: once a classifier is trained, I can click "Visualize classifier errors" > Save, and the saved file contains the predicted class. Now I need this to work in my own Java code.


I have tried something like below:

J48 j48 = new J48();
Evaluation eval = new Evaluation(newData);
StringBuffer forPredictionsPrinting = new StringBuffer();
weka.core.Range attsToOutput = null;
Boolean outputDistribution = new Boolean(true);
eval.crossValidateModel(j48, newData, 10, new Random(1), forPredictionsPrinting, attsToOutput, outputDistribution);

Yet it throws this error:

Exception in thread "main" java.lang.ClassCastException: java.lang.StringBuffer cannot be cast to weka.classifiers.evaluation.output.prediction.AbstractOutput
Kevin

3 Answers

3

The crossValidateModel() method can take a forPredictionsPrinting varargs parameter that is a weka.classifiers.evaluation.output.prediction.AbstractOutput instance.

The important part of that is a StringBuffer to hold a string representation of all the predictions. The following code is in untested JRuby, but you should be able to convert it for your needs.

j48 = J48.new
eval = Evaluation.new(newData)
predictions = java.lang.StringBuffer.new
eval.crossValidateModel(j48, newData, 10, Random.new(1), predictions, Range.new('1'), true)
# the variable predictions now holds a string of all the individual predictions
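
For context on the "classifier is not built" error in the question: as far as I understand it, crossValidateModel() trains a separate copy of the classifier for each fold, so the j48 object passed in is never built itself. That is why per-instance predictions have to be captured during the cross-validation rather than requested from j48 afterwards. The sketch below illustrates the idea in plain Java without Weka. FoldPredictions, MajorityClassifier, and the modulo-based fold assignment are hypothetical stand-ins for illustration only, not Weka classes or Weka's actual fold logic.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class FoldPredictions {

    /** Toy stand-in for a classifier: predicts the majority label of its training data. */
    static final class MajorityClassifier {
        private String majority; // stays null until build() is called

        void build(List<String> trainingLabels) {
            Map<String, Integer> counts = new LinkedHashMap<>();
            for (String label : trainingLabels) {
                counts.merge(label, 1, Integer::sum);
            }
            int best = 0;
            for (Map.Entry<String, Integer> e : counts.entrySet()) {
                if (e.getValue() > best) { // strict '>' keeps the first-seen label on ties
                    best = e.getValue();
                    majority = e.getKey();
                }
            }
        }

        String predict() {
            if (majority == null) {
                throw new IllegalStateException("classifier not built");
            }
            return majority;
        }
    }

    /** Returns one prediction per instance, each made by a model trained on the other folds. */
    static String[] crossValidate(List<String> labels, int numFolds) {
        String[] predictions = new String[labels.size()];
        for (int fold = 0; fold < numFolds; fold++) {
            // Training data for this fold: every instance not assigned to the fold.
            List<String> train = new ArrayList<>();
            for (int i = 0; i < labels.size(); i++) {
                if (i % numFolds != fold) {
                    train.add(labels.get(i));
                }
            }
            MajorityClassifier foldModel = new MajorityClassifier(); // fresh model per fold
            foldModel.build(train);
            // Record the predictions now, while the fold's model still exists.
            for (int i = fold; i < labels.size(); i += numFolds) {
                predictions[i] = foldModel.predict();
            }
        }
        return predictions;
    }

    public static void main(String[] args) {
        List<String> labels = Arrays.asList("a", "a", "a", "b", "a", "a");
        System.out.println(String.join(",", crossValidate(labels, 3)));
    }
}
```

A classifier created outside crossValidate() and never passed to build() still throws in predict(), which mirrors what happens when distributionForInstance() is called on the original j48.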
michaeltwofish
  • 1
    But in http://weka.sourceforge.net/doc/ I don't see any options for crossValidateModel as your description, would you mind to point me to the correct documentation or somewhere that I can see this sort of information? Appreciate!! – Kevin Sep 06 '11 at 13:02
  • 1
    See http://weka.sourceforge.net/doc.dev/weka/classifiers/Evaluation.html#crossValidateModel. SO doesn't seem to be parsing the fragment identifier properly, so scroll down to the first crossValidateModel method signature. – michaeltwofish Sep 06 '11 at 23:17
  • 1
    please see my edit on the question. I tried something w/ ur suggestion... but it prompts me the error, not sure what I've done wrong. Please help!! Thanks!! – Kevin Sep 11 '11 at 14:28
  • 1
    I've not really used weka with Java, only JRuby, where the above code works. Try creating a [PlainText](http://weka.sourceforge.net/doc.dev/weka/classifiers/evaluation/output/prediction/PlainText.html) object, which extends AbstractOutput (called output for example) instance and calling `output.setBuffer(forPredictionsPrinting)` and passing that in instead of the buffer. – michaeltwofish Sep 12 '11 at 02:10
  • 1
    It works!!!!! Thanks you so much!! But with the PlainText instance or the StringBuffer instance, is there a way to get the result of the ith instance? I also notice that the returned prediction included "inst# actual predicted error prediction", yet the GUI one included " inst#, actual, predicted, error, probability distribution" (which included the distribution of all classes). May I know why they are different? And how can I get the one like the GUI? – Kevin Sep 14 '11 at 05:42
  • 1
    If you want the distribution as well, use `output.setOutputDistribution(true);` or you can set all the options by passing an array of options to `output.setOptions()`. I'm not sure about getting individual predictions. You might want to play with the [EvaluationUtils class](http://weka.sourceforge.net/doc/weka/classifiers/evaluation/EvaluationUtils.html). – michaeltwofish Sep 15 '11 at 00:56
  • 1
    would you please check my another post http://stackoverflow.com/questions/7573987/how-to-identify-each-row-of-the-evaluation-result-to-its-corresponding-instance-i to see if you know the answer? Thanks!! – Kevin Sep 27 '11 at 18:34
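
On the question in the comments about getting the result of an individual row: one option is to parse the text collected in the StringBuffer. The sketch below is plain Java and rests on assumptions that should be checked against real output: it assumes each data row has the whitespace-separated columns named in the comment above (inst#, actual, predicted, error, prediction), that the error column is either empty or "+", and that with the distribution enabled the last column is a comma-separated list in which "*" marks the predicted class. PredictionParser and Row are hypothetical names, not Weka classes, and the sample text in main() is made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

final class PredictionParser {

    /** One parsed row of the plain-text prediction output. */
    static final class Row {
        final int instNumber;        // value of the inst# column
        final String actual;         // e.g. "1:yes"
        final String predicted;      // e.g. "2:no"
        final boolean error;         // true when the error column holds "+"
        final double[] distribution; // one value, or one per class if the distribution is printed

        Row(int instNumber, String actual, String predicted, boolean error, double[] distribution) {
            this.instNumber = instNumber;
            this.actual = actual;
            this.predicted = predicted;
            this.error = error;
            this.distribution = distribution;
        }
    }

    /** Parses every line that starts with an instance number; other lines are skipped. */
    static List<Row> parse(String text) {
        List<Row> rows = new ArrayList<>();
        for (String line : text.split("\\R")) {
            String[] tok = line.trim().split("\\s+");
            if (tok.length < 4 || !tok[0].matches("\\d+")) {
                continue; // header or blank line
            }
            boolean error = tok[3].equals("+");
            int probIndex = error ? 4 : 3;
            if (tok.length <= probIndex) {
                continue; // row shorter than the assumed layout
            }
            String[] parts = tok[probIndex].split(",");
            double[] dist = new double[parts.length];
            for (int i = 0; i < parts.length; i++) {
                dist[i] = Double.parseDouble(parts[i].replace("*", "")); // drop the predicted-class marker
            }
            rows.add(new Row(Integer.parseInt(tok[0]), tok[1], tok[2], error, dist));
        }
        return rows;
    }

    public static void main(String[] args) {
        // Made-up sample in the assumed layout, not real Weka output.
        String sample =
              "    inst#     actual  predicted error prediction\n"
            + "        1      1:yes      1:yes       *0.75,0.25\n"
            + "        2       2:no      1:yes   +   *0.6,0.4\n";
        for (Row r : parse(sample)) {
            System.out.println(r.instNumber + " " + r.actual + " -> " + r.predicted
                + (r.error ? " (error)" : ""));
        }
    }
}
```

With the buffer from the question this would be called as PredictionParser.parse(forPredictionsPrinting.toString()). I believe the inst# column restarts within each fold during cross-validation, so the position of a row in the returned list is not necessarily the index of the instance in the original dataset.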
0

I was stuck on this some days ago. I wanted to evaluate a Weka classifier in MATLAB using a matrix instead of loading from an ARFF file. I use http://www.mathworks.com/matlabcentral/fileexchange/21204-matlab-weka-interface and the following source code. I hope this helps someone else.

import weka.classifiers.*;

import java.util.*

wekaClassifier = javaObject('weka.classifiers.trees.J48');

wekaClassifier.buildClassifier(processed);%Loaded from loadARFF

e = javaObject('weka.classifiers.Evaluation',processed);%Loaded from loadARFF
myrand = Random(1);
plainText = javaObject('weka.classifiers.evaluation.output.prediction.PlainText');
buffer = javaObject('java.lang.StringBuffer');
plainText.setBuffer(buffer)
bool = javaObject('java.lang.Boolean',true);
range = javaObject('weka.core.Range','1');
array = javaArray('java.lang.Object',3);
array(1) = plainText;
array(2) = range;
array(3) = bool;
e.crossValidateModel(wekaClassifier,testing,10,myrand,array)
e.toClassDetailsString

Asdrúbal López-Chau

0
clc
clear
%Load from disk
fileDataset = 'cm1.arff';
myPath = 'C:\Users\Asdrubal\Google Drive\Respaldo\DoctoradoALCPC\Doctorado ALC PC\AlcMobile\AvTh\MyPapers\Papers2014\UnderOverSampling\data\Skewed\datasetsKeel\';
javaaddpath('C:\Users\Asdrubal\Google Drive\Respaldo\DoctoradoALCPC\Doctorado ALC PC\AlcMobile\JarsForExperiments\weka.jar');
wekaOBJ = loadARFF([myPath fileDataset]);
%Transform from data into Matlab
[data, featureNames, targetNDX, stringVals, relationName] = ... 
weka2matlab(wekaOBJ,'[]');
%Create testing and training sets in matlab format (this can be improved)
[tam, dim] = size(data);
idx = randperm(tam);
nTest = floor(tam*0.3); %Number of test rows; floor keeps the indices integer
testIdx = idx(1 : nTest);
trainIdx = idx(nTest + 1 : end);
trainSet = data(trainIdx,:);
testSet = data(testIdx,:);
%Transform the training and the testing sets into the Weka format
testingWeka = matlab2weka('testing', featureNames, testSet);
trainingWeka = matlab2weka('training', featureNames, trainSet);
%Now evaluate classifier
import weka.classifiers.*;
import java.util.*
wekaClassifier = javaObject('weka.classifiers.trees.J48');
wekaClassifier.buildClassifier(trainingWeka);
e = javaObject('weka.classifiers.Evaluation',trainingWeka);
myrand = Random(1);
plainText = javaObject('weka.classifiers.evaluation.output.prediction.PlainText');
buffer = javaObject('java.lang.StringBuffer');
plainText.setBuffer(buffer)
bool = javaObject('java.lang.Boolean',true);
range = javaObject('weka.core.Range','1');
array = javaArray('java.lang.Object',3);
array(1) = plainText;
array(2) = range;
array(3) = bool;
e.crossValidateModel(wekaClassifier,testingWeka,10,myrand,array)
e.toClassDetailsString