I'm currently using the W3C Web Speech API for Spanish and Mandarin. Overall the recognition is okay, but there are many errors (especially with single words), and sometimes the transcribed Spanish words have accents added arbitrarily, e.g., lo siento ==> lo síento.
I'm thinking of switching to a more robust and accurate API and found the Google Speech API. While the Web Speech API is free, I'd rather pay for better accuracy (lower error rates). I don't need to transcribe long audio files (usually 6-8 words per sentence at most, and most often 1-4 words), and I intend to make these calls from the browser.
I cannot find documentation on the performance of these two APIs, so any help in deciding whether to switch would be appreciated.