I am converting audio to text using Sphinx, and I can't find how to access the confidence score for each word

I am able to access the transcription output, but I can't get the estimated probabilities the model assigns to it. This feels basic, but I can't find the relevant documentation. What should I add to the code below?

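import speech_recognition as sr

# audio_file is the path to a WAV, AIFF, or FLAC file (defined elsewhere)
test = sr.AudioFile(audio_file)
Recon = sr.Recognizer()

with test as source:
    # read the whole file into an AudioData object
    test_audio = Recon.record(source)
text = Recon.recognize_sphinx(test_audio, language='en-US')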
mzjn
largesse

1 Answer

The confidence score is not returned by the current version of speech-recognition. If you look at the implementation:

def recognize_sphinx(...):
   ...
   # return results
   hypothesis = decoder.hyp()
   if hypothesis is not None: return hypothesis.hypstr
   raise UnknownValueError()  # no transcriptions available

you will see that only the text result (hypothesis.hypstr) is returned, while the confidence is in hypothesis.prob. A quick workaround would be to copy the entire method into your own code, have it return hypothesis.prob as well, and install pocketsphinx on its own:

pip install pocketsphinx
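
As a rough sketch of what the copied method could return instead, the helper below reads both the utterance-level score and the per-word scores from a decoder that has already processed the audio. The name extract_confidences is hypothetical. It assumes the decoder behaves like a pocketsphinx Decoder: hyp() gives an object with hypstr and prob, and seg() yields segments with word and prob. The scale of prob may depend on the pocketsphinx version (it can be a log-domain value), so check it before treating it as a probability.

```python
from typing import Any, List, Optional, Tuple


def extract_confidences(
    decoder: Any,
) -> Tuple[Optional[str], Optional[float], List[Tuple[str, float]]]:
    """Return (text, utterance_score, [(word, word_score), ...]).

    Assumption: `decoder` is a pocketsphinx-style decoder that has already
    decoded an utterance. hyp() returns None when nothing was recognized.
    """
    hypothesis = decoder.hyp()
    if hypothesis is None:
        return None, None, []
    # Each segment is one recognized token; this may include markers such
    # as silence or sentence boundaries, which you may want to filter out.
    words = [(seg.word, seg.prob) for seg in decoder.seg()]
    return hypothesis.hypstr, hypothesis.prob, words


# Possible usage, assuming your speech_recognition version supports the
# show_all flag, which returns the decoder instead of the text:
#
#   decoder = Recon.recognize_sphinx(test_audio, language='en-US', show_all=True)
#   text, score, words = extract_confidences(decoder)
```

The helper only touches hyp() and seg(), so it should work the same whether the decoder comes from your own copy of the method or from the library itself.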

Alexander Solovets