Word error rates for Google Cloud Speech API vs Web Speech API

Question

I'm currently using the W3C Web Speech API for Spanish and Mandarin. Overall the recognition is okay, but there are many errors (especially with single words), and transcribed Spanish words sometimes gain arbitrary accents, e.g., lo siento ==> lo síento.

I'm thinking of switching to a more robust and accurate API and found the Google Cloud Speech API. While the Web Speech API is free, I'd rather pay for better accuracy (lower error rates). I don't need to transcribe long audio files (sentences are usually 6-8 words at most, and most often 1-4 words), and I intend to make these calls from the browser.

I cannot find documentation on the accuracy of these two APIs, so any help in deciding whether to switch would be appreciated.

score 1 · Accepted Answer · answered Sep 11 '19 at 08:03


The Google Speech API is not perfect either; you will get the best accuracy from a specialized solution.
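Since neither API seems to publish comparable error rates, one way to decide is to measure word error rate (WER) yourself on a handful of your own phrases. Here is a minimal sketch, assuming you have a reference transcript and the text each recognizer returned. The function name `word_error_rate` and the `by_char` flag are made up for illustration; `by_char` scores per character, which is the usual choice for Mandarin since it has no spaces between words.

```python
def word_error_rate(reference: str, hypothesis: str, by_char: bool = False) -> float:
    """Edit distance between reference and hypothesis tokens, divided by
    the number of reference tokens (illustrative helper, not part of any API)."""
    if by_char:
        # Character-level scoring: drop whitespace, one token per character.
        ref = [c for c in reference if not c.isspace()]
        hyp = [c for c in hypothesis if not c.isspace()]
    else:
        ref = reference.lower().split()
        hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one token")

    # prev[j] holds the edit distance between the reference tokens
    # processed so far and the first j hypothesis tokens.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if r == h else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)
```

Run the same recordings through both recognizers and compare the averages. Note that an accent error such as "síento" for "siento" counts as a full word substitution here, so short 1-4 word phrases will swing the score a lot.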

Calling the Google Speech API directly from the browser is not really an option, since you would have to expose your API key in the browser, which is a bad idea. You'll have to maintain server infrastructure anyway.
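To illustrate the server-side part, here is a rough sketch of what the backend could do after receiving audio from the browser. It is written in Python with only the standard library as a stand-in for whatever server you run. It assumes the v1 REST endpoint `speech:recognize` with an API key passed as the `key` query parameter, and 16 kHz LINEAR16 audio; check those against the current Google documentation. The function names and the `GOOGLE_SPEECH_API_KEY` environment variable are hypothetical.

```python
import base64
import json
import os
import urllib.request

# Assumed v1 REST endpoint; verify against the current Google docs.
RECOGNIZE_URL = "https://speech.googleapis.com/v1/speech:recognize"


def build_recognize_request(audio_bytes, language_code, api_key,
                            sample_rate_hertz=16000):
    """Build the HTTP request without sending it (hypothetical helper)."""
    body = {
        "config": {
            "encoding": "LINEAR16",  # assumption: raw 16-bit PCM audio
            "sampleRateHertz": sample_rate_hertz,
            "languageCode": language_code,  # e.g. "es-ES"
        },
        "audio": {"content": base64.b64encode(audio_bytes).decode("ascii")},
    }
    return urllib.request.Request(
        f"{RECOGNIZE_URL}?key={api_key}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def transcribe(audio_bytes, language_code):
    """Send audio to the API from the server and return the top transcripts."""
    # The key is read on the server, so it is never shipped to the browser.
    api_key = os.environ["GOOGLE_SPEECH_API_KEY"]  # hypothetical variable name
    request = build_recognize_request(audio_bytes, language_code, api_key)
    with urllib.request.urlopen(request, timeout=10) as response:
        payload = json.load(response)
    return [r["alternatives"][0]["transcript"]
            for r in payload.get("results", [])]
```

The browser would record the audio and POST it to your own endpoint, which calls something like `transcribe(audio, "es-ES")` and returns only the text.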


Nikolay Shmyrev


Is the Web Speech API my best option then for foreign-language speech recognition in the browser? – user3871 Sep 11 '19 at 17:35
If you don't want to have a server, then yes. It depends on application details a lot. – Nikolay Shmyrev Sep 11 '19 at 19:55
Well, I already have a PHP API, so if the Google Cloud Speech API has better recognition than the Web Speech API, I'll make the switch. – user3871 Sep 11 '19 at 20:51
