I would like to create a custom scorer in scikit-learn that I can pass to GridSearchCV and that evaluates model performance based on the prediction accuracy for one particular class.
Suppose that my training data consists of data-points belonging to one of three classes:
'dog', 'cat', 'mouse'
# Imports (using the sklearn.model_selection API):
from sklearn import ensemble
from sklearn.model_selection import GridSearchCV, StratifiedKFold

# Create a classifier:
clf = ensemble.RandomForestClassifier()

# Set up some parameters to explore:
param_dist = {
    'n_estimators': [500, 1000, 2000, 4000],
    'criterion': ['gini', 'entropy'],
    'bootstrap': [True, False]
}

# Construct the grid search:
search = GridSearchCV(clf,
                      param_grid=param_dist,
                      cv=StratifiedKFold(n_splits=10),
                      scoring=my_scoring_function)

# Perform the search:
X = training_data
y = ground_truths
search.fit(X, y)
Is there a way to construct my_scoring_function such that only the accuracy of predictions for the 'dog' class is returned? The make_scorer function seems limited in that it only deals with the ground truth and the predicted class for each data point.
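To make the goal concrete, here is a rough sketch of the kind of scorer I am imagining (the dog_accuracy helper is just my own illustration of the behaviour I want, with the 'dog' label hard-coded for simplicity):

import numpy as np
from sklearn.metrics import make_scorer

def dog_accuracy(y_true, y_pred):
    # Accuracy computed only over the samples whose true label is 'dog'
    # (i.e. the recall of the 'dog' class).
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    dog_mask = (y_true == 'dog')
    return np.mean(y_pred[dog_mask] == y_true[dog_mask])

# Wrapped so it can be passed to GridSearchCV via scoring=...
my_scoring_function = make_scorer(dog_accuracy)

In other words, what I am calling "accuracy for the 'dog' class" is essentially the recall of that class over each cross-validation fold.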
Many thanks in advance for your help!