
I have a pipeline that combines a TF-IDF vectorizer with a GBM binary classifier and returns the label and the class probabilities. In production I don't want the label; I only want the probability of class 1 to come out of the pipeline. Can I change the pipeline so that it returns just the probability of class 1, instead of the label and the probabilities of both 0 and 1?

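from sklearn.pipeline import Pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.ensemble import GradientBoostingClassifier

gbm_pipeline = Pipeline([('tfidf', TfidfVectorizer(use_idf=True)),
                         ('gbm', GradientBoostingClassifier(random_state=23))])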

When I use this pipeline to predict, it returns something like

predict [1]
predict_proba [{0: 0.47260814905166626, 1: 0.5273918509483337}]

whereas I just want it to be

0.5273918509483337

PS: I cannot post-process the pipeline's output. I want to make the change in the pipeline itself so that, instead of the label and both probabilities, I get only the probability of class 1.

Gaurab Das

1 Answer


I would just run a for loop in such situations:

probab = []
a = [{0: 0.47260814905166626, 1: 0.5273918509483337}]

for x in a:
    probab.append(x.get(1))

The probability is stored in probab:

print(probab)

[0.5273918509483337]
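If the extraction has to happen inside the pipeline rather than on its output, one option might be to wrap the classifier in a small custom estimator whose predict returns the class-1 column of predict_proba. This is only a sketch under a few assumptions: PositiveProbaClassifier is a name I made up (it is not part of scikit-learn), the toy texts and labels are placeholders for your real training data, and it assumes your production code calls predict on the pipeline and that label 1 is present in the training labels.

```python
from sklearn.base import BaseEstimator, clone
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import Pipeline


class PositiveProbaClassifier(BaseEstimator):
    """Hypothetical wrapper: predict() returns P(label == 1), not the label."""

    def __init__(self, estimator):
        self.estimator = estimator

    def fit(self, X, y):
        # Fit a fresh copy so the estimator passed to __init__ stays untouched.
        self.estimator_ = clone(self.estimator).fit(X, y)
        # Locate the predict_proba column that corresponds to label 1.
        self.pos_index_ = list(self.estimator_.classes_).index(1)
        return self

    def predict(self, X):
        # One probability per input row, taken from the label-1 column.
        return self.estimator_.predict_proba(X)[:, self.pos_index_]


# Placeholder training data, only here to make the sketch runnable.
texts = ["good great film", "bad awful film", "great nice plot",
         "awful terrible plot", "nice good acting", "terrible bad acting"]
labels = [1, 0, 1, 0, 1, 0]

gbm_pipeline = Pipeline([
    ("tfidf", TfidfVectorizer(use_idf=True)),
    ("gbm", PositiveProbaClassifier(GradientBoostingClassifier(random_state=23))),
])
gbm_pipeline.fit(texts, labels)

# predict now yields a 1-D array of class-1 probabilities, one per document.
proba = gbm_pipeline.predict(["great film"])
print(float(proba[0]))
```

Note that predict still returns an array with one entry per input document, so for a single document you would take element [0]. Whether your serving layer accepts that in place of a label is something you would need to check on your side.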

Adarsh Wase
  • The issue is I cannot play around with the pipeline's output. I need to do something in the pipeline itself so that it gives me only probability. – Gaurab Das Sep 08 '21 at 05:59