Let's say I've created a model with ~30 training examples for each of 10 categories, taking all of the defaults that were provided to me.
The Average F1 Score for the model is 0.875 (I have 2 categories that are very closely related, so that's hurting accuracy a bit).
If I do a real-time prediction for a piece of text that should match positively for categories 3 and 8, I get this result:
```json
{
    "Prediction": {
        "details": {
            "Algorithm": "SGD",
            "PredictiveModelType": "MULTICLASS"
        },
        "predictedLabel": "8",
        "predictedScores": {
            "1": 0.002642059000208974,
            "2": 0.010648942552506924,
            "3": 0.41401588916778564,
            "4": 0.02918998710811138,
            "5": 0.008376320824027061,
            "6": 0.009010250680148602,
            "7": 0.006029266398400068,
            "8": 0.4628857374191284,
            "9": 0.04102163389325142,
            "10": 0.01617990992963314
        }
    }
}
```
What I'm wondering is whether 3 and 8 each had effectively ~80% certainty on their own, but because both matched, that certainty was split between the two. If you sum all of the predictedScores, you get 0.999999997, which has me questioning whether there's a total score of 1.0 that gets split amongst the available categories...
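For reference, this is the quick check behind that number; it just sums the predictedScores values from the response above:

```python
# Sum the predictedScores returned in the real-time prediction above.
predicted_scores = {
    "1": 0.002642059000208974,
    "2": 0.010648942552506924,
    "3": 0.41401588916778564,
    "4": 0.02918998710811138,
    "5": 0.008376320824027061,
    "6": 0.009010250680148602,
    "7": 0.006029266398400068,
    "8": 0.4628857374191284,
    "9": 0.04102163389325142,
    "10": 0.01617990992963314,
}

total = sum(predicted_scores.values())
print(total)  # ~0.999999997, i.e. effectively 1.0
```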
If I instead set up 10 separate binary models and ran the prediction against each of them independently, would 3 and 8 score higher (e.g. something closer to 0.8)?
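To make that alternative concrete, this is roughly what I have in mind; it's only a sketch using boto3's `predict` call, and the model IDs, the `"text"` attribute name, and the endpoint URL are placeholders rather than a real setup:

```python
import boto3

client = boto3.client("machinelearning")

# Hypothetical IDs for ten independent binary models, one per category.
binary_model_ids = {
    "1": "ml-category-1-example",
    "2": "ml-category-2-example",
    # ...and so on for the remaining categories...
    "10": "ml-category-10-example",
}

text_to_classify = "a piece of text that should match categories 3 and 8"

scores = {}
for category, model_id in binary_model_ids.items():
    response = client.predict(
        MLModelId=model_id,
        Record={"text": text_to_classify},  # attribute name depends on your data schema
        PredictEndpoint="https://realtime.machinelearning.us-east-1.amazonaws.com",
    )
    # A binary model returns a single score, read here as the probability
    # of the positive class for that category.
    scores[category] = list(response["Prediction"]["predictedScores"].values())[0]

print(scores)  # e.g. would categories 3 and 8 both come back near 0.8?
```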
A related question, which I don't really need answered but which might help clarify the overall question: if I had a theoretical piece of text that definitely fit all 10 categories, could Amazon Machine Learning respond with a predictedScore of 1.0 for each category? Or, because the scores appear to be split out of a total of 1.0, would it return 0.1 for each category?