I'm trying to plot a precision-recall curve for binary text classification using ktrain (a wrapper for BERT), and I'm getting the following error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-142-9069d96b6d6b> in <module>
----> 1 disp = plot_precision_recall_curve(predictor, df_testing.description.values, df_testing.pred_class)
      2 disp.ax_.set_title('2-class Precision-Recall curve: '
      3                    'AP={0:0.2f}'.format(average_precision))

C:\conda\lib\site-packages\sklearn\metrics\_plot\precision_recall_curve.py in plot_precision_recall_curve(estimator, X, y, sample_weight, response_method, name, ax, **kwargs)
    145         estimator.__class__.__name__))
    146     if not is_classifier(estimator):
--> 147         raise ValueError(classification_error)
    148 
    149     prediction_method = _check_classifer_response_method(estimator,

ValueError: TextPredictor should be a binary classifier

My predictor is <ktrain.text.predictor.TextPredictor at 0x2d9361e61c8>. Is there a way in which I can convert my predictor to a binary classifier?

  • It would be useful to know what your predictor is returning. Is it probabilities for more than two classes? Or is `plot_precision_recall_curve` expecting true binary predictions, like `[0, 1]`? In that case you probably have log probabilities (something like `[-2.3, -0.3]`) or softmax probabilities (like `[0.2, 0.8]`). Some more insights or code would help. Greetings, Patrick – Haller Patrick Sep 08 '21 at 13:03
  • Let's say the text I want to classify is stored in `test`. Then `print(predictor.predict_proba(test))` gives `[0.7587824 0.24121764]` and `print(predictor.predict(test))` gives `CLASSA`. The model only considers two classes, CLASSA and CLASSB. Yes, I think `plot_precision_recall_curve` is expecting true binary predictions. I'm not sure how to convert to that. – nikviz Sep 08 '21 at 13:16
  • It depends on the format in which your prediction is returned. If it is a NumPy array, you can use something like [numpy.around](https://numpy.org/doc/stable/reference/generated/numpy.around.html); with TensorFlow tensors, [tf.math.round](https://www.tensorflow.org/api_docs/python/tf/math/round). – Haller Patrick Sep 08 '21 at 13:24
  • Thank you for this. I found another package/function, `skplt.metrics.plot_precision_recall_curve()`. It just takes the true classes and the prediction probabilities, so I did not have to pass in an estimator/predictor, and the graph was plotted successfully. Thanks for your help, as it helped me understand the function better. – nikviz Sep 08 '21 at 14:15
  • Please provide enough code so others can better understand or reproduce the problem. – Community Sep 13 '21 at 21:22
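
For reference, the approach described in the comments (plotting from true classes and predicted probabilities instead of passing an estimator) can probably also be done with scikit-learn alone. The following is only a minimal sketch under stated assumptions: the arrays `y_true_labels` and `probs` are made-up stand-ins for the test labels and for whatever `predictor.predict_proba(...)` returns on the test texts, the class order `["CLASSA", "CLASSB"]` is assumed and should be checked against the actual predictor, and `plot_pr` is a hypothetical helper name.

```python
import numpy as np
from sklearn.metrics import average_precision_score, precision_recall_curve

# Made-up stand-ins (assumptions, not data from the question).
# In the question these would be the true test labels and the array of
# per-class probabilities returned by the predictor for the test texts.
class_names = ["CLASSA", "CLASSB"]  # assumed column order of the probabilities
y_true_labels = np.array(["CLASSA", "CLASSB", "CLASSA", "CLASSB", "CLASSA"])
probs = np.array([
    [0.76, 0.24],
    [0.30, 0.70],
    [0.55, 0.45],
    [0.60, 0.40],
    [0.90, 0.10],
])

# Pick which class counts as "positive" and take its probability column.
positive = "CLASSB"
pos_idx = class_names.index(positive)
y_true = (y_true_labels == positive).astype(int)  # 1 for positive, 0 otherwise
y_score = probs[:, pos_idx]

# Both functions need only labels and scores, no estimator object.
precision, recall, thresholds = precision_recall_curve(y_true, y_score)
average_precision = average_precision_score(y_true, y_score)


def plot_pr(precision, recall, average_precision):
    """Hypothetical helper: draw the curve from precomputed values."""
    import matplotlib.pyplot as plt
    from sklearn.metrics import PrecisionRecallDisplay

    disp = PrecisionRecallDisplay(
        precision=precision,
        recall=recall,
        average_precision=average_precision,
    )
    disp.plot()
    disp.ax_.set_title(
        "2-class Precision-Recall curve: AP={0:0.2f}".format(average_precision)
    )
    plt.show()
```

Calling `plot_pr(precision, recall, average_precision)` should then draw the curve. The design point is that the `ValueError` comes from the `is_classifier(estimator)` check shown in the traceback, which `TextPredictor` does not pass, so working from precomputed scores sidesteps that check rather than converting the predictor.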
