Google speech to text api (reading from GCS broken)

Question

As the title says, I can't get the Speech-to-Text API to work with audio stored in Google Cloud Storage (GCS).

With local files shorter than one minute it works well, but when I give it a GCS link (to the same short file or to a longer one) I get bad results: either nothing at all, or only a very small portion of the file transcribed (about 5 words out of a 2-minute speech).

Is there some gotcha that I'm not aware of, or is this a known bug that I just couldn't find on the internet? Here is the code used (Google's own example in JS):

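  const speech = require('@google-cloud/speech');

  // Wrapped in an async function so that `await` is valid.
  // `toRec` is the name of the audio object inside the bucket.
  async function transcribe(toRec) {
    const client = new speech.SpeechClient();
    const gcsUri = `gs://myBucket/${toRec}`;
    const encoding = 'AMR';
    const sampleRateHertz = 8000;
    const languageCode = 'sr-RS';

    const config = {
      encoding: encoding,
      sampleRateHertz: sampleRateHertz,
      languageCode: languageCode,
    };

    // The audio is referenced by its GCS URI instead of being sent inline.
    const audio = {
      uri: gcsUri,
    };

    const request = {
      config: config,
      audio: audio,
    };

    // Start the long-running operation, then wait for it to finish.
    const [operation] = await client.longRunningRecognize(request);
    const [response] = await operation.promise();
    // Keep only the top alternative of each result.
    const transcription = response.results
      .map(result => result.alternatives[0].transcript)
      .join('\n');
    console.log(`Transcription: ${transcription}`);
  }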
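For comparison, here is a minimal sketch of how the local and GCS requests might differ. It assumes the local variant sends the bytes inline as a base64 `content` field while the GCS variant sends only a `uri`, with the same `config` in both cases. The helper `buildRequest`, the object name `example.amr`, and the sample bytes are made up for illustration and are not part of Google's example.

```javascript
// Hypothetical helper: builds a recognize request from either a GCS URI
// or raw audio bytes. Only the `audio` field differs between the two.
function buildRequest(config, source) {
  const audio = source.gcsUri
    ? {uri: source.gcsUri}
    : {content: Buffer.from(source.bytes).toString('base64')};
  return {config, audio};
}

const sharedConfig = {
  encoding: 'AMR',
  sampleRateHertz: 8000,
  languageCode: 'sr-RS',
};

// 'example.amr' is a placeholder object name.
const fromGcs = buildRequest(sharedConfig, {gcsUri: 'gs://myBucket/example.amr'});

// Six placeholder bytes standing in for the contents of a local file.
const fromLocal = buildRequest(sharedConfig, {
  bytes: Buffer.from([0x23, 0x21, 0x41, 0x4d, 0x52, 0x0a]),
});

console.log(fromGcs.audio);
console.log(fromLocal.audio);
```

If both requests really share one config like this, a difference in output would point at the stored object or at how the service reads it, rather than at the recognition settings.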

Maybe you are hitting a quota. I also did not find any issue related to this code. — Andie Vanille, Jun 27 '20 at 02:53
@AndieVanille It is a fresh account with billing enabled. Less than 60 minutes (the first 60 minutes are free every month) were used. The same sound file works well when uploaded outside of gcs. — Bosko Sinobad, Jun 28 '20 at 00:07
Can you post the [response](https://cloud.google.com/speech-to-text/docs/reference/rpc/google.cloud.speech.v1#longrunningrecognizeresponse) from the API? — IDMT, Jul 01 '20 at 15:38
@IsaacMiliani Both the local and the GCS request respond with two transcriptions (alternatives). I provided the same file, but the GCS transcription is far inferior. The GCS one has confidences of 0.8605201840400696 and 0.4814569354057312, and the local one has 0.8711020946502686 and 0.7969886660575867. — Bosko Sinobad, Jul 02 '20 at 10:04
That's strange. I would try upgrading your client libraries and, if the issue persists, [create an issue in the public tracker](https://issuetracker.google.com/issues/new?component=451645&template=0). — IDMT, Jul 03 '20 at 14:40
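To compare the two responses in more detail than the joined transcript allows, something like the following sketch could list every result and alternative with its confidence and word count. The helper name `summarizeResponse` is made up, the response shape (`results[].alternatives[]` with `transcript` and `confidence`) is assumed from the question's code and the comments above, and the transcripts in the sample object are placeholders; only the two confidence values are taken from this thread.

```javascript
// Hypothetical helper: flattens a recognize response into one row per
// alternative so that short or missing segments are easy to spot.
function summarizeResponse(response) {
  const results = (response && response.results) || [];
  return results.flatMap((result, resultIndex) =>
    (result.alternatives || []).map((alt, altIndex) => ({
      result: resultIndex,
      alternative: altIndex,
      confidence: alt.confidence,
      // Count whitespace-separated words in the transcript.
      words: (alt.transcript || '').trim().split(/\s+/).filter(Boolean).length,
      transcript: alt.transcript || '',
    }))
  );
}

// Sample object: confidences as reported for the GCS request,
// transcripts are placeholders.
const sampleResponse = {
  results: [
    {alternatives: [{transcript: 'placeholder text one', confidence: 0.8605201840400696}]},
    {alternatives: [{transcript: 'placeholder two', confidence: 0.4814569354057312}]},
  ],
};

const rows = summarizeResponse(sampleResponse);
for (const row of rows) {
  console.log(row);
}
```

Running this on both the local and the GCS response would show whether the GCS request returns fewer results, shorter transcripts, or just lower confidences.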
