I'm using Google's nodejs-speech package to use the longRunningRecognize
endpoint/function in Google's Speech API.
I've used both v1
and v1p1beta
, and run into an error with longer files. (48 mins is as long as I've tried, and 15 mins causes the same problem, though 3 mins does not). I've tried both the promise pattern and separating the request into two parts -- one to start the longRunningRecognize process, and the other to check on results after waiting. The error is shown below the code samples for both.
Example promise version of request:
import speech from '@google-cloud/speech';
const client = new speech.v1p1beta1.SpeechClient();
const audio = {
uri: 'gs://my-bucket/file.m4a'
};
const config = {
encoding: 'AMR_WB',
sampleRateHertz: 16000,
languageCode: 'en-US',
enableWordTimeOffsets: true,
enableSpeakerDiarization: true
};
const request = {
audio,
config
};
client.longRunningRecognize(request)
.then(data => {
const operation = data[0];
return operation.promise();
})
.then(data => {
const response = data[0];
const results = response.results;
const transcription = results
.filter(result => result.alternatives)
.map(result => result.alternatives[0].transcript)
.join('\n');
console.log(transcription);
})
.catch(error => {
console.error(error);
});
(I've since closed the tab with the results, but I think this returned an error object that just said { error: { code: 13 } }
, which matches the below, more descriptive error).
Separately, I've tried a version where instead of chaining promises to get the final transcription result, I collect the name
from the operation, and make a separate request to get the result.
Here's that request code:
... // Skipping setup
client.longRunningRecognize(request)
.then(data => {
const operation = data[0];
console.log(operation.latestResponse.name);
})
.catch(error => {
console.error(error);
});
When I hit the relevant endpoint (https://speech.googleapis.com/v1p1beta1/operations/81703347042341321989?key=ABCD12345
) before it's had time to process, I get this:
{
"name": "81703347042341321989",
"metadata": {
"@type": "type.googleapis.com/google.cloud.speech.v1p1beta1.LongRunningRecognizeMetadata",
"startTime": "2018-08-16T19:33:26.166942Z",
"lastUpdateTime": "2018-08-16T19:41:31.456861Z"
}
}
Once it's fully processed, though, I've been running into this:
{
"name": "81703347042341321989",
"metadata": {
"@type": "type.googleapis.com/google.cloud.speech.v1p1beta1.LongRunningRecognizeMetadata",
"progressPercent": 100,
"startTime": "2018-08-16T17:20:28.772208Z",
"lastUpdateTime": "2018-08-16T17:44:40.868144Z"
},
"done": true,
"error": {
"code": 13,
"message": "Server unavailable, please try again later."
}
}
I've tried with shorter audio files (3 mins, same format and encoding), and the above processes both worked.
Any idea what's going on?