My Weka output shows:

Correctly Classified Instances       32083               94.0244 %
Incorrectly Classified Instances      2039                5.9756 %

I want to print the incorrectly classified instances so I can make adjustments and understand why they were misclassified.

My print method is below. It tries to find the instances whose predicted class value differs from the actual class value and then print their attributes, but the attribute enumeration prints nothing.

Does anyone have a suggestion for how to print the misclassified instances?

Thanks very much.

private void printSummary(Classifier base, Evaluation eval, Instances data) throws Exception
{
    // output evaluation
    System.out.println();
    System.out.println("=== Setup ===");
    System.out.println("Classifier: " + base.getClass().getName() + " " + Utils.joinOptions(base.getOptions()));
    System.out.println("Dataset: " + data.relationName());
    System.out.println();

    // output predictions
    System.out.println("# - actual - predicted - error - distribution - token");
    for (int i = 0; i < data.numInstances(); i++) 
    {
        double pred = base.classifyInstance(data.instance(i));
        double actual = data.instance(i).classValue();
        double[] dist = base.distributionForInstance(data.instance(i));

        if (pred != actual)
        {
            System.out.print((i+1));
            System.out.print(" - ");
            System.out.print(data.instance(i).toString(data.classIndex()));
            System.out.print(" - ");
            System.out.print(data.classAttribute().value((int) pred));
            System.out.print(" - ");
            // Always "yes" here: this branch only runs for misclassified instances.
            System.out.print("yes");
            System.out.print(" - ");
            System.out.print(Utils.arrayToString(dist));
            System.out.print(" - ");
            // Print the instance itself; Instance.toString() lists its attribute values.
            System.out.print(data.instance(i));
            System.out.println();
        }
    }

    System.out.println(eval.toSummaryString());
    System.out.println(eval.toClassDetailsString());
    System.out.println(eval.toMatrixString());
}
britt
  • I just realized that it's empty because I forgot to wrap it in a print statement. – britt Oct 18 '11 at 21:27
  • Unfortunately, adding it to a print statement or iterating over the values of the enumeration just shows me all possible values of all attributes. It doesn't actually tell me what the attribute values were for my misclassified instances. Any help is greatly appreciated. – britt Oct 18 '11 at 21:28
  • As I understand it, the "Weka output" is from Weka Explorer? If so, those results are for cross-validation, and you are most probably testing in some other way. Before diving into the code, can you run a simple test: add an `else` clause to your `if` and count how many instances are actually classified correctly (`pred == actual`)? – ffriend Oct 18 '11 at 21:45

2 Answers

This is an old post, but I had the same problem and solved it differently. Maybe someone like me will need it.

Evaluation has a predictions() method that returns an ArrayList of Prediction objects.

Each Prediction object has an actual and a predicted value, so I simply printed every instance whose actual value differs from its predicted value.

My code:

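import java.util.ArrayList;
import weka.classifiers.evaluation.Prediction;
import weka.core.Instance;

// Note: pairing by index assumes the predictions were recorded in the same
// order as trainData (e.g. by evaluating on trainData directly). Cross-validation
// randomizes a copy of the data, so the indexes would not line up in that case.
ArrayList<Prediction> predictions = evaluation.predictions();
for (int i = 0, trainDataSize = trainData.size(); i < trainDataSize; i++) {
    Instance instance = trainData.get(i);
    Prediction prediction = predictions.get(i);

    // Print only the instances whose predicted class differs from the actual class.
    if (prediction.actual() != prediction.predicted()) {
        System.out.println(instance);
    }
}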

Hope it helps someone.

danny11
I do it this way:

  1. Train the classifier.
  2. For each instance, call 'classifier.explain' (in Weka, the equivalent is `distributionForInstance`).
  3. If the classification is incorrect, store the instance ordered by the probability given to the wrong class (from the most confident error to the least confident error).
  4. The most confident errors give me ideas about which features should be added to the classifier.
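
The steps above can be sketched roughly as follows. This is only an illustration in plain Java, not Weka code: the class `MisclassifiedRanker`, the `Mistake` holder, and the `rankErrors` method are hypothetical names, and the `actual` and `dist` arrays are assumed to have been filled beforehand from `classValue()` and `distributionForInstance` for each instance.

```java
import java.util.ArrayList;
import java.util.List;

public class MisclassifiedRanker {

    /** Hypothetical holder for one misclassified instance. */
    public static final class Mistake {
        public final int index;          // position of the instance in the data set
        public final int actual;         // index of the true class
        public final int predicted;      // index of the class the model chose
        public final double confidence;  // probability the model gave to the wrong class

        Mistake(int index, int actual, int predicted, double confidence) {
            this.index = index;
            this.actual = actual;
            this.predicted = predicted;
            this.confidence = confidence;
        }
    }

    /**
     * Collects the misclassified instances and orders them from the most
     * confident error to the least confident one.
     * Assumption: dist[i] is the class distribution predicted for instance i.
     */
    public static List<Mistake> rankErrors(int[] actual, double[][] dist) {
        List<Mistake> mistakes = new ArrayList<>();
        for (int i = 0; i < actual.length; i++) {
            int predicted = argMax(dist[i]);
            if (predicted != actual[i]) {
                mistakes.add(new Mistake(i, actual[i], predicted, dist[i][predicted]));
            }
        }
        // Highest confidence in the wrong class first.
        mistakes.sort((a, b) -> Double.compare(b.confidence, a.confidence));
        return mistakes;
    }

    /** Returns the index of the largest value (the first one on ties). */
    static int argMax(double[] values) {
        int best = 0;
        for (int i = 1; i < values.length; i++) {
            if (values[i] > values[best]) {
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Made-up example values, only for demonstration.
        int[] actual = {0, 1, 1, 0};
        double[][] dist = {{0.2, 0.8}, {0.1, 0.9}, {0.6, 0.4}, {0.05, 0.95}};
        for (Mistake m : rankErrors(actual, dist)) {
            System.out.println("instance " + m.index + ": actual=" + m.actual
                    + " predicted=" + m.predicted + " confidence=" + m.confidence);
        }
    }
}
```

Sorting by the probability given to the wrong class puts the errors the model was most sure about at the top, which are usually the ones worth inspecting first.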
yura
  • I don't see a classifier.explain method available. Your comment did help point something out for me, though. Previously I was using the eval object to print the summary data and the confusion matrix, but I was also calling classifier.classifyInstance. The important difference was that the eval data was based on the 10x cross-validation, while classifyInstance was classifying the instance again using the trained classifier. – britt Oct 19 '11 at 15:23
  • I was able to get the output I needed by just printing the instance like this: System.out.print(data.instance(i)); – britt Oct 19 '11 at 15:24
  • I'm sorry, in Weka it is `double[] distributionForInstance(Instance instance) Predicts the class memberships for a given instance.` But it does not work well for all classifiers; for Bayes and trees it always returns 1,0. It works well for LibSVM, Logistic, and some others that support prediction probabilities. – yura Oct 20 '11 at 09:02