
I understand that scikit-learn's DummyClassifier (https://scikit-learn.org/stable/modules/generated/sklearn.dummy.DummyClassifier.html) provides a measure of baseline performance and can apply different strategies to make its predictions.

I have found examples of its use in binary classification problems. Can this classifier also be used in a multiclass scenario? If so, how does the stratified strategy work there?

desertnaut
Deb

1 Answer


DummyClassifier indeed supports multiclass classification. Here is a small example:

import numpy as np
from sklearn.datasets import make_classification
from sklearn.dummy import DummyClassifier


X, y = make_classification(n_classes=3, n_clusters_per_class=1, random_state=42)

clf = DummyClassifier(strategy='stratified')
clf.fit(X, y)

It even supports multiclass-multioutput classification, since its fit method accepts y in either of the following shapes:

y: array-like of shape (n_samples,) or (n_samples, n_outputs)
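
As a rough illustration of the multioutput case, here is a minimal sketch. The random features and the two label columns are made up for the example (they are not from the question), and random_state is fixed only so the run is repeatable:

```python
import numpy as np
from sklearn.dummy import DummyClassifier

rng = np.random.default_rng(0)

# Hypothetical data: 60 samples, 4 features (the features are ignored by DummyClassifier)
X = rng.normal(size=(60, 4))

# Two target columns: labels drawn from {0, 1, 2} and from {0, 1, 2, 3}
y = np.column_stack([
    rng.integers(0, 3, size=60),
    rng.integers(0, 4, size=60),
])

clf = DummyClassifier(strategy="stratified", random_state=0)
clf.fit(X, y)

# One predicted label per sample and per output column
pred = clf.predict(X)
print(pred.shape)
```

Each output column is handled separately, so the predictions for one column should follow that column's own class distribution.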


To understand the stratified strategy, you might want to consult the user guide:

stratified generates random predictions by respecting the training set class distribution

So the predictions will be random but still in accordance with the distribution of classes in the training set, as you can see here:

print(np.unique(y, return_counts=True))
# output: (array([0, 1, 2]), array([34, 33, 33])) 

print(np.unique(clf.predict(X), return_counts=True))
# output: (array([0, 1, 2]), array([32, 38, 30]))
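
To make the idea more concrete, the sketch below compares the class frequencies stored by the fitted estimator (class_prior_) with the frequencies computed by hand, and then draws labels directly from those frequencies. The manual sampling is only a conceptual stand-in for what the stratified strategy does, not scikit-learn's actual implementation, and the seeds are arbitrary:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.dummy import DummyClassifier

X, y = make_classification(n_classes=3, n_clusters_per_class=1, random_state=42)

clf = DummyClassifier(strategy="stratified", random_state=0).fit(X, y)

# Empirical class frequencies in the training labels
classes, counts = np.unique(y, return_counts=True)
freqs = counts / counts.sum()

# The fitted estimator stores the same frequencies
print(freqs)
print(clf.class_prior_)

# Conceptual equivalent: draw each prediction independently from those frequencies
rng = np.random.default_rng(0)
manual_pred = rng.choice(classes, size=len(y), p=freqs)
print(np.unique(manual_pred, return_counts=True))
```

Because each prediction is an independent random draw, the predicted counts only approximate the training distribution, which is why the two outputs above differ slightly.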
afsharov