I've been using Google Cloud Video Intelligence for text detection. Now I also want speech transcription, so I added the SPEECH_TRANSCRIPTION
feature alongside TEXT_DETECTION,
but the response only contains results for one feature, the last one.
// Assumed setup: `video` is a VideoIntelligenceServiceClient instance
const {VideoIntelligenceServiceClient} = require('@google-cloud/video-intelligence');
const video = new VideoIntelligenceServiceClient();
const gcsUri = 'gs://path-to-the-video-on-gcs';
const request = {
  inputUri: gcsUri,
  features: ['TEXT_DETECTION', 'SPEECH_TRANSCRIPTION'],
};
// Detects text and transcribes speech in a video
const [operation] = await video.annotateVideo(request);
const [operationResult] = await operation.promise();
const annotationResult = operationResult.annotationResults[0];
const textAnnotations = annotationResult.textAnnotations;
const speechTranscriptions = annotationResult.speechTranscriptions;
console.log(textAnnotations); // --> []
console.log(speechTranscriptions); // --> [{...}]
Is this a case where annotation is performed on only one feature at a time?
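One thing I still need to rule out: annotationResults is an array, and the snippet reads only index 0. Assuming results for different features can arrive in separate entries (an assumption, not something I have confirmed), a minimal sketch for gathering both fields across every entry could look like this. The helper name collectAnnotations and the mock data are hypothetical and only illustrate the shape of the response fields used in the question.

```javascript
// Hypothetical helper: gathers text and speech results from every entry of
// annotationResults instead of reading only the first one.
function collectAnnotations(annotationResults) {
  const textAnnotations = [];
  const speechTranscriptions = [];
  for (const result of annotationResults || []) {
    // Either field may be missing or empty on a given entry.
    textAnnotations.push(...(result.textAnnotations || []));
    speechTranscriptions.push(...(result.speechTranscriptions || []));
  }
  return { textAnnotations, speechTranscriptions };
}

// Mock data standing in for operationResult.annotationResults (assumed shape).
const mockResults = [
  { textAnnotations: [], speechTranscriptions: [{ alternatives: [{ transcript: 'hello' }] }] },
  { textAnnotations: [{ text: 'EXIT' }], speechTranscriptions: [] },
];
const merged = collectAnnotations(mockResults);
console.log(merged.textAnnotations.length);      // 1
console.log(merged.speechTranscriptions.length); // 1
```

With the real response, this would be called as collectAnnotations(operationResult.annotationResults), and logging operationResult.annotationResults.length first would show whether there is more than one entry at all.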