Many speech-to-text services (such as Google's) return a confidence score. For Google, at least, it lies between 0 and 1, but it is clearly not the probability that a particular transcription is correct, since the confidences of alternative transcriptions can add up to more than 1. Also, a higher-confidence result is sometimes ranked lower.
So, what is it? Is there a recognized meaning of 'confidence score' in the speech recognition community? I have seen references to minimum Bayes risk, but even if that is what they are doing, it doesn't really answer the question, since the result depends on the choice of an auxiliary loss function.
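
To make the dependence on the loss function concrete, here is the minimum Bayes risk decision rule as I understand it. The notation is my own assumption, not taken from any vendor's documentation: $X$ is the audio, $\mathcal{H}$ is the set of candidate transcriptions, $P(W' \mid X)$ is the recognizer's posterior, and $L$ is the auxiliary loss.

$$
\hat{W} \;=\; \arg\min_{W \in \mathcal{H}} \; \sum_{W' \in \mathcal{H}} P(W' \mid X)\, L(W, W')
$$

With the 0-1 loss, $L(W, W') = 1$ if $W \neq W'$ and $0$ otherwise, the expected loss is $1 - P(W \mid X)$, so the rule reduces to picking the most probable transcription. With $L$ set to the word-level edit distance, it instead picks the transcription with the fewest expected word errors, which need not be the most probable one. Either way, the rule only says which transcription is chosen, not what number is reported as its confidence.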