I'm trying to transcribe a short interview audio file with the Google Cloud Speech API (asynchronously), but it only transcribes the first half minute of the recording. I made several attempts with recordings longer than one minute, and the results were the same. How can I get the full transcription of a given file?
Here is one of my use cases:
Submit the recognition request for the audio file (already uploaded to Cloud Storage):
POST https://speech.googleapis.com/v1beta1/speech:asyncrecognize?key={YOUR_API_KEY}
{
  "config": {
    "encoding": "LINEAR16",
    "sampleRate": 16000
  },
  "audio": {
    "uri": "gs://protean-blend-146812.appspot.com/record__2017_02_02_12_02_17_greg_16000.wav"
  }
}
Got the operation number in the response:
{
  "name": "8977932499808116064"
}
Make a request with the operation number:
GET https://speech.googleapis.com/v1beta1/operations/8977932499808116064?key={YOUR_API_KEY}
Got the result:
{
  "name": "8977932499808116064",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.speech.v1beta1.AsyncRecognizeMetadata",
    "progressPercent": 100,
    "startTime": "2017-02-02T11:21:41.346784Z",
    "lastUpdateTime": "2017-02-02T11:23:03.150491Z"
  },
  "done": true,
  "response": {
    "@type": "type.googleapis.com/google.cloud.speech.v1beta1.AsyncRecognizeResponse",
    "results": [
      {
        "alternatives": [
          {
            "transcript": "McGregor you have any stories about being lost that you have all the good advice well let me know in the Golden Triangle drug trafficking across the border",
            "confidence": 0.8535113
          }
        ]
      },
      {
        "alternatives": [
          {
            "transcript": "we came across this Village very very poor Village People and some of the people there were really unfriendly they just started throwing rocks and my friend and we couldn't talk so we backed away went back quickly up the hill",
            "confidence": 0.9027881
          }
        ]
      },
      {
        "alternatives": [
          {
            "transcript": "and we are wondering you know where to go and luckily I can see in the distance there in one tree",
            "confidence": 0.8931573
          }
        ]
      }
    ]
  }
}
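For anyone who wants to reproduce this outside the API console, the two requests and the shape of the result can be sketched in Python roughly as follows. This is only an illustration, not the code that produced the output above: the helper names (build_async_request, operation_url, join_transcripts) are hypothetical, the functions only build the URLs and body and do not send anything, and the field names are taken from the v1beta1 request and response shown above.

```python
import json
from urllib.parse import urlencode

# Base URL as used in the requests above (v1beta1).
BASE_URL = "https://speech.googleapis.com/v1beta1"


def build_async_request(gcs_uri, api_key, sample_rate=16000):
    """Return (url, json_body) for the speech:asyncrecognize POST.

    Hypothetical helper: it only builds the request, it does not send it.
    """
    url = f"{BASE_URL}/speech:asyncrecognize?{urlencode({'key': api_key})}"
    body = json.dumps({
        "config": {"encoding": "LINEAR16", "sampleRate": sample_rate},
        "audio": {"uri": gcs_uri},
    })
    return url, body


def operation_url(operation_name, api_key):
    """Return the GET URL used to poll the operation by its name."""
    return f"{BASE_URL}/operations/{operation_name}?{urlencode({'key': api_key})}"


def join_transcripts(operation):
    """Join the first alternative of every result into one string.

    Assumes the response layout shown above:
    response.results[].alternatives[].transcript
    """
    results = operation.get("response", {}).get("results", [])
    return " ".join(
        r["alternatives"][0]["transcript"]
        for r in results
        if r.get("alternatives")
    )
```

Running join_transcripts on the result above gives only the three segments listed, so the truncation is already present in the API response rather than introduced when reading it.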
The links where I made the requests (in the 'Try it!' section):