How does SageMaker Clarify bias detection work for features that are continuous?

Does it bin continuous variables automatically or do users need to bin them themselves before running the bias detection job?

Using the fairness and explainability example, I selected 'Capital Gain' as the facet (its values range from 0 to 99999, with no nulls) and set facet_values_or_threshold=[5000], expecting the data to be split at 5000.

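from sagemaker import clarify

# Facet is the continuous 'Capital Gain' column; 5000 is intended as a threshold.
bias_config = clarify.BiasConfig(label_values_or_threshold=[0],
                                 facet_name='Capital Gain',
                                 facet_values_or_threshold=[5000])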

The result was: "error": "CI: facet set is empty. Check that x[facet] has non-zero length."

I assume this is because 'Capital Gain' never takes the exact value 5000. To check, I tested with facet_values_or_threshold=[2174], a value that does occur in the column:

bias_config = clarify.BiasConfig(label_values_or_threshold=[0],
                                 facet_name='Capital Gain',
                                 facet_values_or_threshold=[2174])

and got a result:

{
    "version": "1.0",
    "pre_training_bias_metrics": {
        "label": "Target",
        "facets": {
            "Capital Gain": [
                {
                    "value_or_threshold": "2174",
                    "metrics": [
                        {
                            "name": "CI",
                            "description": "Class Imbalance (CI)",
                            "value": 0.9969498043896293
                        }
                    ]
                }
            ]
        },
        "label_value_or_threshold": "0"
    }
}
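
The reported CI value can be used to sanity-check the exact-match assumption. The sketch below is only a local illustration, not Clarify's implementation: it assumes the documented Class Imbalance definition CI = (n_a - n_d) / (n_a + n_d), where n_d is the number of rows inside the facet, and it uses a small made-up list of values in place of the real 'Capital Gain' column. The helper names (class_imbalance, facet_share_from_ci) are hypothetical.

```python
def class_imbalance(in_facet):
    """CI = (n_a - n_d) / (n_a + n_d); n_d counts rows inside the facet."""
    n_d = sum(1 for flag in in_facet if flag)
    n_a = len(in_facet) - n_d
    if n_d == 0:
        # Mirrors the situation behind the "facet set is empty" error.
        raise ValueError("facet set is empty")
    return (n_a - n_d) / (n_a + n_d)


def facet_share_from_ci(ci):
    """Invert the CI formula: fraction of rows in the facet = (1 - CI) / 2."""
    return (1.0 - ci) / 2.0


# Made-up stand-in for the 'Capital Gain' column (assumption, not real data).
values = [0, 0, 0, 0, 0, 0, 2174, 7298, 15024, 99999]

# Interpretation 1: 5000 acts as a threshold (facet = values above it).
ci_threshold = class_imbalance([v > 5000 for v in values])   # 0.4

# Interpretation 2: the value is matched exactly (categorical behaviour).
ci_exact = class_imbalance([v == 2174 for v in values])      # 0.8

# Manual binning: derive a 0/1 column first, then match the bin label 1.
binned = [1 if v > 5000 else 0 for v in values]
ci_binned = class_imbalance([b == 1 for b in binned])        # same as ci_threshold

# Facet share implied by the CI reported in the job output above.
share = facet_share_from_ci(0.9969498043896293)
print(ci_threshold, ci_exact, ci_binned, share)
```

Under that formula, a CI of about 0.997 implies that only roughly 0.15% of the rows fell into the facet. That looks more consistent with matching the single value 2174 than with splitting at a threshold, although it does not prove how the job interpreted the column. If the column is indeed being treated as categorical, pre-binning it into a 0/1 column as in the sketch and using that column as the facet would be one possible workaround.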

juvchan