
I want to obtain some information during the classification of some test instances. I'm using a MultiClassClassifier model with the SMO option to predict classes. I found some code here, but it only displays basic information (ID, actual class and predicted class). That's really cool, but I want that cherry on top.

Here is the code:

double classLabel = cModel.classifyInstance(testInstances.instance(i));
System.out.print("ID: " + testInstances.instance(i).value(0));
System.out.print(", actual: " + testInstances.classAttribute().value((int) testInstances.instance(i).classValue()));
System.out.println(", predicted: " + testInstances.classAttribute().value((int) classLabel));
labeled.instance(i).setClassValue(classLabel);

Here is an example of what shows up in the console (everything works, including the classification):

ID: 10.840449559881472, actual: class_1, predicted: class_12

I would like to add a probability value to the output that would show something between 0 and 1 for the predicted class (like 0.80... for example). How can I achieve that?

I've tried this: double[] p = cModel.individualPredictions(testInstances.instance(i)); but this returns numbers I really can't make sense of.

Example of an output:

7.664525149317826E-177

EDIT:

OK. Now I've used the distributionForInstance method and it actually returned some sensible numbers (I had used it before and it gave me those strange ones), but the probabilities for some cases are really low although they are correctly classified. I might need to add more samples to my classifier, but at least it gives results now.

This piece of code shows results (for future reference):

double[] p = cModel.distributionForInstance(testInstances.instance(i));

Prediction examples of some correctly classified unknown samples:

0.6801721826680843 -- example 1 class 12

0.9834993119977282 -- example 2 class 14

0.20165539938974703 -- example 3 class 1

0.9947991411834111 -- example 4 class 9

0.9809472418105786 -- example 5 class 3

Will probably stick with this solution as it is the most reasonable one I've found so far.
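To print a single probability for the predicted class, one option is to take the largest entry of the array returned by distributionForInstance. The following is only a minimal sketch: the class PredictionConfidence and its methods are hypothetical helpers, not part of Weka, and the array in main is made-up data standing in for a real distribution. It also shows that a value such as 7.664525149317826E-177 is an ordinary number between 0 and 1 written in scientific notation, which a fixed-point format prints as 0.0000.

```java
import java.util.Locale;

// Hypothetical helper, not part of Weka. It works only on the double[]
// that cModel.distributionForInstance(...) would return.
class PredictionConfidence {

    /** Returns the index of the largest entry, i.e. the predicted class index. */
    static int argMax(double[] distribution) {
        int best = 0;
        for (int k = 1; k < distribution.length; k++) {
            if (distribution[k] > distribution[best]) {
                best = k;
            }
        }
        return best;
    }

    /** Formats a probability with four decimals instead of scientific notation. */
    static String format(double probability) {
        return String.format(Locale.ROOT, "%.4f", probability);
    }

    public static void main(String[] args) {
        // Made-up distribution standing in for distributionForInstance(...).
        double[] p = {0.05, 0.68, 0.27};
        int predicted = argMax(p);
        System.out.println("predicted index: " + predicted
                + ", probability: " + format(p[predicted]));
        // A tiny value in scientific notation is still a number between 0 and 1.
        System.out.println(format(7.664525149317826E-177));
    }
}
```

In the loop from the question, p would come from cModel.distributionForInstance(testInstances.instance(i)), and the class name could be looked up with testInstances.classAttribute().value(predicted).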

Thanks again.

c00ki3s
  • `double[] p = cModel.individualPredictions(testInstances.instance(i))`: here `p` contains the probabilities of the instance belonging to the individual classes. For example, `p[0] = 7.664525149317826E-177` means that the probability that the instance belongs to class 0 is very low. You can, however, find other classes in `p` where the probability is higher. – Istvan Nagy Mar 07 '16 at 20:55
  • Thanks Istvan, for the explanation. It's really hard for me to interpret numbers like that, since I'm more used to probabilities between 0 and 1. These outputs are really out of my league. :) – c00ki3s Mar 07 '16 at 21:14
  • This may be helpful for you: http://stackoverflow.com/questions/11960580/weka-classification-likelihood-of-the-classes – Istvan Nagy Mar 08 '16 at 14:06
  • Or this: http://stackoverflow.com/questions/20605615/how-to-get-predication-value-for-an-instance-in-weka – Istvan Nagy Mar 08 '16 at 14:16
  • Yes, I've used the `distributionForInstance` method. It gives results that are much more readable than those of `individualPredictions`. Now I'm looking for a way to filter my model (exclude some classes from classification), but this is a whole new topic. Thanks again Istvan. – c00ki3s Mar 08 '16 at 15:04

2 Answers


How would you address the challenge when you also need to predict the "no class" category? Predicting the "no class" / unrecognised class in Weka Machine Learning

Marc Giombetti
  • Weka predicts the class nearest to your unknown sample, that is, the most similar one. This means that if you want to predict a "no class" class, you would have to include valid training data for a "no class" class, which I think is absurd. A better way might be to calculate the prediction error or the prediction confidence. If the error is too high or the confidence too low, set the class value of your instance to "no class". You may want to check out a discussion here: http://forums.pentaho.com/showthread.php?72003-Prediction-confidence-in-Weka – c00ki3s Mar 08 '16 at 16:14
  • And another discussion on Stack Overflow: http://stackoverflow.com/questions/11084248/weka-prediction-percentage-confidence-what-does-it-mean and another one here: http://stackoverflow.com/questions/21902473/how-to-calculate-confidence-from-weka-api – c00ki3s Mar 08 '16 at 16:19
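
The confidence-threshold idea from the comment above could be sketched roughly as follows. This is an illustration under assumptions, not a Weka feature: the class NoClassThreshold, the NO_CLASS marker and the minConfidence cut-off are all hypothetical names, and the input array stands in for the result of distributionForInstance.

```java
// Hypothetical helper, not part of Weka. It rejects a prediction when the
// highest class probability falls below a chosen cut-off.
class NoClassThreshold {

    /** Hypothetical marker for "no class"; not a Weka constant. */
    static final int NO_CLASS = -1;

    /**
     * Returns the index of the most probable class, or NO_CLASS when that
     * probability is below minConfidence.
     */
    static int predictOrReject(double[] distribution, double minConfidence) {
        int best = 0;
        for (int k = 1; k < distribution.length; k++) {
            if (distribution[k] > distribution[best]) {
                best = k;
            }
        }
        return distribution[best] >= minConfidence ? best : NO_CLASS;
    }

    public static void main(String[] args) {
        // Made-up distributions standing in for distributionForInstance(...).
        double[] confident = {0.02, 0.98};
        double[] uncertain = {0.2, 0.5, 0.3};
        System.out.println(predictOrReject(confident, 0.6));
        System.out.println(predictOrReject(uncertain, 0.6));
    }
}
```

The cut-off of 0.6 is arbitrary; a suitable value would have to be chosen by looking at the probabilities the model produces on known data.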

So, I will take the liberty of answering my own question here, since I would like to help SO close it and move on.

The classifyInstance() method assigns an instance to a class, but it didn't provide the data I was looking for.

I've tried double[] p = cModel.individualPredictions(testInstances.instance(i)); and double[] p = cModel.distributionForInstance(testInstances.instance(i)); which both returned results.

I stuck with the cModel.distributionForInstance(testInstances.instance(i)); method, since I needed to exclude some classes manually from my results. Ignoring the unwanted entries of the distribution returned by a MultiClassClassifier was the only possible solution at the moment. This temporarily solves my filtering problem for classes that are too distant from my desired determination.

Here is the related post: WEKA - filtering out classes in a MultiClassClassifier
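
For illustration, ignoring unwanted entries of the distribution might look like the sketch below. The class ClassFilter and its method are hypothetical, not part of Weka, and rescaling the remaining probabilities so that they sum to 1 again is an assumption added here, not something the approach above requires.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Set;

// Hypothetical helper, not part of Weka. It zeroes the probabilities of
// excluded class indices and rescales the rest to sum to 1 (an assumption).
class ClassFilter {

    static double[] excludeAndRenormalize(double[] distribution, Set<Integer> excluded) {
        double[] filtered = new double[distribution.length];
        double sum = 0.0;
        for (int k = 0; k < distribution.length; k++) {
            if (!excluded.contains(k)) {
                filtered[k] = distribution[k];
                sum += distribution[k];
            }
        }
        // Rescale only if some probability mass remains.
        if (sum > 0.0) {
            for (int k = 0; k < filtered.length; k++) {
                filtered[k] /= sum;
            }
        }
        return filtered;
    }

    public static void main(String[] args) {
        // Made-up distribution standing in for distributionForInstance(...).
        double[] p = {0.5, 0.3, 0.2};
        double[] filtered = excludeAndRenormalize(p, Collections.singleton(1));
        System.out.println(Arrays.toString(filtered));
    }
}
```

The predicted class is then the index of the largest remaining entry, which can never be one of the excluded classes unless every class was excluded.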

Thanks again.

c00ki3s