I am trying to convert the output of CMU Sphinx's recognizer, i.e. a list of < hypothesis (phrase), score (in the log domain) > pairs obtained by tweaking test_ps_nbest.c, into the following form: a list of < hypothesis (phrase), "probability" (between 0 and 1) > pairs.

A trivial method which I am using now is as follows:

  1. Divide each confidence score by the language weight (e.g. 11)
  2. Normalize the list of confidence scores in the log domain
  3. Output probability = exp(normalized confidence score)
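
For concreteness, the three steps above can be sketched in C++ roughly as follows. This is only an illustration under stated assumptions: the function name naiveNbestProbabilities is hypothetical, "normalize in the log domain" is taken to mean subtracting the log-sum-exp of the scaled scores, and the scaled scores are treated as natural logarithms, which may not match the recognizer's actual log base.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Hypothetical helper reproducing the three steps listed above.
// Assumption: after scaling, the scores are treated as natural logs.
std::vector<double> naiveNbestProbabilities(const std::vector<double>& logScores,
                                            double languageWeight) {
    std::vector<double> probs;
    if (logScores.empty()) return probs;

    // Step 1: divide each score by the language weight.
    std::vector<double> scaled;
    scaled.reserve(logScores.size());
    for (double s : logScores) scaled.push_back(s / languageWeight);

    // Step 2: normalize in the log domain with log-sum-exp.
    // Subtracting the maximum first keeps exp() from underflowing to zero.
    const double maxScore = *std::max_element(scaled.begin(), scaled.end());
    double sum = 0.0;
    for (double s : scaled) sum += std::exp(s - maxScore);
    const double logNorm = maxScore + std::log(sum);

    // Step 3: exponentiate the normalized scores.
    probs.reserve(scaled.size());
    for (double s : scaled) probs.push_back(std::exp(s - logNorm));
    return probs;
}
```

Calling naiveNbestProbabilities with the three example scores below and a language weight of 11 returns values that sum to 1 by construction, with almost all of the mass on the first hypothesis.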

The problem is that the output probabilities from the above method are biased. Do you have any suggestions for how I can estimate the bias in these probabilities?

Example of the method that I have to implement to correct the bias:

vector<double> getBias(vector<string> phrases, vector<double> logConfidenceScores)

Example input for the above discussion:

< "HE GOT IN OUR HEAD HEART LUNG AND HE MARKED IT", -43278 >

< "HE GOT IN OUR AT OUR CLASSES MONEY AND HE MARKED IT", -43449 >

< "HE GOT IN POWER AT HEART LUNG AND HE MARKED IT", -43368 >

Niketan

1 Answer

A trivial method which I am using now is as follows:
Divide each confidence score by the language weight (e.g. 11)

First of all, it's not a confidence score, just a score. Why do you divide? The score in the list is an acoustic score too, so the language weight makes no sense here.

Normalize the list of confidence scores in the log domain

This also makes no sense, because there is a huge probability mass outside the list that you do not account for.
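
To illustrate the point about unaccounted probability mass, here is a minimal sketch. The function name topPosteriorOverFirstN is hypothetical, and the scores are assumed to be natural logs sorted best-first. It computes the "probability" of the best hypothesis when normalizing over only the first n entries of a list.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: "probability" of the best hypothesis when the
// normalization only covers the first n entries of the list.
// Assumption: logScores holds natural-log scores sorted best-first.
double topPosteriorOverFirstN(const std::vector<double>& logScores, std::size_t n) {
    n = std::min(n, logScores.size());
    if (n == 0) return 0.0;

    // Sum exp(score_i - score_0); the best hypothesis contributes exactly 1.
    double sum = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        sum += std::exp(logScores[i] - logScores[0]);
    }
    return 1.0 / sum;
}
```

With made-up scores such as {-1.0, -1.5, -2.0, -2.2, -2.5}, normalizing over the first two entries gives the top hypothesis a noticeably higher value than normalizing over all five. The shorter the list, the more the omitted hypotheses inflate the result.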

Output probability = exp(normalized confidence score)

This sequence of steps has no mathematical justification, so it is no surprise that you didn't get a good result.

If you want a per-utterance confidence score, you might want to review the theory first:

http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.93.6890&rep=rep1&type=pdf

Nikolay Shmyrev