Fairness metrics for multi-class classification

Question

Are there any metrics implemented in Fairlearn, or any published papers I can refer to, for measuring fairness in multi-class classification where the metric is AP rather than accuracy? Thanks!

Roman Lutz · Answer 1 · 2021-06-30T17:05:58.513

Update: The Fairlearn documentation now has a FAQ section on this topic: https://fairlearn.org/main/faq.html (search for "Does Fairlearn support multi-class classification?").

Previous answer: Fairlearn's metrics are designed for binary classification or regression. You could, of course, evaluate each label individually by treating it as its own binary problem. If you have a specific idea of what you'd like to see, please open a new feature request.

Fairlearn does support a variety of metrics, not just accuracy. The user guide has a full list: https://fairlearn.org/v0.6.0/user_guide/assessment.html#scalar-results-from-metricframe

One example that comes to mind for a paper doing multi-class classification while thinking about fairness is CheXclusion by Seyyed-Kalantari et al. They mostly look into TPR differences when classifying chest x-rays.
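To make the "evaluate each label individually" idea concrete, here is a minimal sketch in plain Python (no Fairlearn calls) that treats every class as a one-vs-rest binary problem and compares the true positive rate across groups. The function names `per_label_tpr_by_group` and `tpr_gap`, and the toy data, are made up for illustration; they are not part of Fairlearn or of the CheXclusion paper.

```python
from collections import defaultdict


def per_label_tpr_by_group(y_true, y_pred, groups):
    """Return {label: {group: TPR}} using a one-vs-rest view of each label.

    A group that has no true examples of a label is left out for that label,
    because its TPR would be undefined.
    """
    result = {}
    for label in sorted(set(y_true)):
        positives = defaultdict(int)       # true examples of `label` per group
        true_positives = defaultdict(int)  # of those, how many were predicted as `label`
        for t, p, g in zip(y_true, y_pred, groups):
            if t == label:
                positives[g] += 1
                if p == label:
                    true_positives[g] += 1
        result[label] = {g: true_positives[g] / positives[g] for g in positives}
    return result


def tpr_gap(tpr_by_group):
    """Largest minus smallest TPR across groups for one label."""
    values = list(tpr_by_group.values())
    return max(values) - min(values)


# Hypothetical toy data: three classes, two groups ("x" and "y").
y_true = ["a", "a", "b", "b", "c", "c", "a", "a", "b", "b", "c", "c"]
y_pred = ["a", "a", "b", "c", "c", "c", "a", "b", "b", "b", "c", "a"]
groups = ["x"] * 6 + ["y"] * 6

tprs = per_label_tpr_by_group(y_true, y_pred, groups)
gaps = {label: tpr_gap(by_group) for label, by_group in tprs.items()}

for label in tprs:
    print(label, tprs[label], "gap:", gaps[label])
```

The same pattern should carry over to other per-label metrics: binarize the labels for one class at a time, compute the metric per group, then summarize the spread across groups.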

The Fairlearn community would definitely be interested in hearing about your use case. Perhaps there's some way we can help. Feel free to reach out via Gitter or by creating your feature request (as mentioned above).
