I'm trying to make an alarm clock Android app that can be stopped by voice. For that, I'm using the Google Speech Recognition API (plus this code to run voice recognition continuously).

It works fine until I play music at the same time. In that case the voice recognition becomes much less accurate.

This is to be expected, since the music adds noise that makes recognition harder. But since the music being played is known, I was wondering whether it is possible to tell Google to ignore this additional noise. I know there are filters in signal processing that can do this (such as the Kalman filter or the Wiener filter).
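To make the idea concrete: when the interfering signal is known, one common way to remove it is an adaptive filter such as normalized LMS (NLMS), the technique behind acoustic echo cancellation. This is a swapped-in illustration, not a Kalman or Wiener filter, and it is not something the Google recognizer is known to expose. The sketch below assumes the app has both the microphone samples and the time-aligned music samples as `double` values; the class name `KnownNoiseCanceller` and its parameters are hypothetical.

```java
/**
 * Hypothetical NLMS adaptive canceller (illustration only).
 * It learns how the known music reaches the microphone and subtracts
 * that estimate, leaving whatever the music cannot explain (the speech).
 */
public class KnownNoiseCanceller {
    private static final double EPS = 1e-8;  // avoids division by zero on silence

    private final double[] weights;   // adaptive FIR estimate of the speaker-to-mic path
    private final double[] history;   // most recent reference samples, newest first
    private final double stepSize;    // adaptation rate, stable for 0 < stepSize < 2

    public KnownNoiseCanceller(int taps, double stepSize) {
        this.weights = new double[taps];
        this.history = new double[taps];
        this.stepSize = stepSize;
    }

    /** Processes one sample pair and returns the mic sample with the estimated music removed. */
    public double process(double mic, double reference) {
        // Shift the reference history by one and store the newest sample first.
        System.arraycopy(history, 0, history, 1, history.length - 1);
        history[0] = reference;

        // Predict the music component of the mic signal and measure reference energy.
        double estimate = 0.0;
        double energy = EPS;
        for (int i = 0; i < weights.length; i++) {
            estimate += weights[i] * history[i];
            energy += history[i] * history[i];
        }

        // The residual is the cleaned output and also drives the weight update.
        double error = mic - estimate;
        double gain = stepSize * error / energy;
        for (int i = 0; i < weights.length; i++) {
            weights[i] += gain * history[i];
        }
        return error;
    }
}
```

A filter like this only helps where the app controls the audio that is fed to the recognizer, and it depends on the reference being well aligned with the microphone capture. Real rooms usually need many more taps than a toy example.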

So my question is: Is it possible to apply a filter with Google voice recognition to ignore a known noise? Or is there another voice recognition library that allows that?

Edit: It's not a duplicate, since the problem is not the same. It is an interesting suggestion, though.

Pika Supports Ukraine
  • Possible duplicate of [Keyword Spotting in Speech on Android?](https://stackoverflow.com/questions/9533808/keyword-spotting-in-speech-on-android) – Nikolay Shmyrev Jan 27 '18 at 15:48

1 Answer

Google's voice recognition is already optimised to detect speech, regardless of the type of ambient background noise.

Rather than using Google's native voice recognition, supplied via their 'Now/Assistant' application, you can use their Cloud Speech API, which offers some enhancements. Its best-practices documentation (linked below) says:

The recognizer is designed to ignore background voices and noise without additional noise-canceling. However, for optimal results, position the microphone as close to the user as possible, particularly when background noise is present.

The above is no doubt true of their voice recognition system in general. The same documentation also advises:

Use word and phrase hints to add names and terms to the vocabulary and to boost the accuracy for specific words and phrases.

For short queries or commands, use StreamingRecognize with single_utterance set to true. This optimizes the recognition for short utterances and also minimizes latency.

https://cloud.google.com/speech/docs/best-practices

brandall
  • Thanks for your suggestion. I tried the Cloud API and it's almost perfect; the only problem is that it's not free. Are there free alternatives to the Cloud API? – Timothé Malahieude Feb 01 '18 at 09:53
  • I would accept offline solutions that run on the phone, or something that I can deploy to my own server. – Timothé Malahieude Feb 01 '18 at 09:53
  • @TimothéMalahieude I only know of Sphinx, links to which are in the 'duplicate question' thread. – brandall Feb 01 '18 at 10:58
  • As per the best practices, it says "It's best to provide audio that is as clean as possible by using a good quality and well-positioned microphone". So I believe you need to provide an audio file with as clear a speech as possible, with as little background noise as possible. – talon Jun 14 '19 at 07:44