
I've built a voice recognition system with Angular, WebSockets, Node.js, and the Google Speech-to-Text API.

It works very well on almost all words, but it has real issues with the word "no": it seems almost as though the word "no" doesn't even get passed to the API, as no interim results arrive. This issue doesn't occur for words such as "yes", for longer words, or even for numbers (1, 2, 3, etc.).

I.e. the .on('data', (data) => { handler of streamingRecognize outputs nothing, seemingly until it "hears" a word like "yes" or "hello"; "no" is not picked up unless it is said with a lot of oomph.

Any ideas?

Config:

  sampleRateHertz = 48000;                
  languageCode = 'en-US';                 
  single_utterance = true;                // Processes after short sound burst (sentence/word)
  interimResults = true;                  // Reports back findings mid-sentence. Useful for "processing" UI
  metadata = {
    microphoneDistance: 'NEARFIELD',       
    interactionType: 'VOICE_SEARCH',      
    recordingDeviceType: 'PC',             
  };
bionara

1 Answer


I can suggest a couple of ideas:

  • Did you check whether an END_OF_SINGLE_UTTERANCE event appears when you try to recognize the word "no"? Since you are using single_utterance, this would help narrow down where the issue is happening.
  • You could try using speechContexts (phrase hints) to increase the probability that Speech-to-Text recognizes the word "no". For example:

  "config": {
    "sampleRateHertz": 48000,
    "languageCode": "en-US",
    "single_utterance": true,
    "interimResults": true,
    "metadata": {
      "microphoneDistance": "NEARFIELD",
      "interactionType": "VOICE_SEARCH",
      "recordingDeviceType": "PC"
    },
    "speechContexts": [{
      "phrases": ["no"]
    }]
  }
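
Both ideas could be wired into the Node.js side roughly as follows. This is only a sketch under a few assumptions: buildStreamingRequest and describeResponse are hypothetical helper names, the LINEAR16 encoding is a guess (the question does not state one), and the event type is assumed to arrive as a string. In the Node.js client the singleUtterance and interimResults flags normally sit next to config rather than inside it, while speechContexts belongs inside config. The sample rate is the 48000 Hz value from the question.

```javascript
// Hypothetical helper: builds a streaming request with phrase hints.
// speechContexts goes inside `config`; singleUtterance and interimResults
// sit beside it at the top level of the request.
function buildStreamingRequest({ encoding, sampleRateHertz, languageCode, phrases }) {
  return {
    config: {
      encoding,
      sampleRateHertz,
      languageCode,
      speechContexts: [{ phrases }], // phrase hints, e.g. ['no']
      metadata: {
        microphoneDistance: 'NEARFIELD',
        interactionType: 'VOICE_SEARCH',
        recordingDeviceType: 'PC',
      },
    },
    singleUtterance: true,
    interimResults: true,
  };
}

// Hypothetical helper: summarises one streaming response so you can see
// whether the utterance ended without any transcript being produced.
function describeResponse(data) {
  if (data.speechEventType === 'END_OF_SINGLE_UTTERANCE') {
    return 'end of utterance';
  }
  const result = data.results && data.results[0];
  const alternative = result && result.alternatives && result.alternatives[0];
  if (!alternative) {
    return 'no transcript';
  }
  return `${result.isFinal ? 'final' : 'interim'}: ${alternative.transcript}`;
}

const request = buildStreamingRequest({
  encoding: 'LINEAR16', // assumption: use whatever your audio stream really is
  sampleRateHertz: 48000,
  languageCode: 'en-US',
  phrases: ['no'],
});

// Usage with the real client (not executed in this sketch):
// const speech = require('@google-cloud/speech');
// const client = new speech.SpeechClient();
// const recognizeStream = client
//   .streamingRecognize(request)
//   .on('error', console.error)
//   .on('data', (data) => console.log(describeResponse(data)));
// Audio chunks received over the websocket are then written to recognizeStream.
```

If saying "no" logs only "end of utterance" and never an interim or final line, the audio is reaching the API but is being treated as too short or too quiet to transcribe, which is the case the phrase hint is meant to help with.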

davidmesalpz