I have set up a synchronous recognition script in Python that correctly returns transcripts for the audio files I send to the Google Speech API. However, I can't get the speech context hints (speech_contexts in Python; "phrase hints" / speechContext in the Google documentation) to have any noticeable effect. I have an audio file in which the speaker clearly says the word "health", but it is transcribed as "house" every time. Even though I explicitly tell the API to look for the word "health", as shown in the code below, it never returns it. Any advice on making this feature effective?

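# Imports for the pre-2.0 google-cloud-speech client (enums/types modules)
from google.cloud import speech
from google.cloud.speech import enums, types

# Recognition config with a single phrase hint for the word "health"
config = types.RecognitionConfig(
    encoding=enums.RecognitionConfig.AudioEncoding.LINEAR16,
    language_code='en-US',
    speech_contexts=[speech.types.SpeechContext(phrases=['health'])],
)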

Thanks!

justishar
  • Try adding a full phrase, not just a single word, or at least a sequence of several surrounding words. – Nikolay Shmyrev May 11 '18 at 16:33
  • Thanks, I did try that and eventually got it to match the word, but I had to include three additional preceding words from that audio file to do so. Unfortunately, I won't have the luxury of looking for a phrase this long in production. This will only be useful to me if I can force it to always choose, in this example, the word "health" whenever there is a reasonable chance that the word could be "health". – justishar May 14 '18 at 15:21
  • This is the way it works. Usually, if a word is not recognized, the proper fix is to use the correct acoustic model rather than trying to fix it with hints. – Nikolay Shmyrev May 15 '18 at 07:30
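
The suggestion in the comments (hint a sequence of surrounding words rather than the bare word) could be automated roughly as follows. This is only a minimal sketch: build_phrase_hints, its window and max_hints parameters, and the sample transcripts are all hypothetical names and values made up for illustration, not part of the Google client library. It assumes you have some known-good text from your domain in which the target word appears.

```python
def build_phrase_hints(target, transcripts, window=2, max_hints=50):
    """Collect short phrases around `target` from known-good text.

    Hypothetical helper: `window` is the number of words kept on each
    side of the target, `max_hints` is an arbitrary cap on the result.
    """
    target = target.lower()
    hints = [target]  # keep the bare word as the first hint
    seen = {target}
    for text in transcripts:
        words = text.lower().split()
        for i, word in enumerate(words):
            # Compare with surrounding punctuation stripped.
            if word.strip('.,!?;:"') != target:
                continue
            # Take up to `window` words on each side of the match.
            start = max(0, i - window)
            end = min(len(words), i + window + 1)
            phrase = " ".join(w.strip('.,!?;:"') for w in words[start:end])
            if phrase not in seen:
                seen.add(phrase)
                hints.append(phrase)
    return hints[:max_hints]


# Hypothetical sample text; replace with transcripts from your own domain.
samples = [
    "Please review your health insurance plan.",
    "The health of the patient improved.",
]
phrases = build_phrase_hints("health", samples)
```

The resulting list could then be passed as phrases=phrases to SpeechContext in the config from the question. Whether this helps depends on how closely the sample text matches what speakers actually say, and the API limits the number and length of phrases per request, so check the current documentation before sending a long list.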

0 Answers