Is Google Cloud Speech to Text LongRunningRecognize really that slow?

Question

I've written a Python script that splits an MP3 of about an hour into 5-minute chunks, converts them to FLAC, and uploads them to a Google Cloud Storage bucket. I then run Speech-to-Text recognition on them, but it is pretty slow: every 5-minute chunk takes about 2 minutes, and a 53-minute audio file took about 25 minutes. Shouldn't it be far faster? This is the part of the code that does the speech-to-text:

from google.cloud import speech
from google.cloud.speech import enums, types

# Create the client once and reuse it for every chunk.
client = speech.SpeechClient.from_service_account_json('credentials2.json')

# x is the index of the last chunk; it is set earlier in the script.
for i in range(0, x + 1):
    storage_uri = 'gs://MYBUCKET/sound-%s.flac' % i
    print(storage_uri)
    # The audio is taken from the bucket, so the local FLAC file is not read here.
    audio = types.RecognitionAudio(uri=storage_uri)
    # Not passed to RecognitionConfig below, so it has no effect.
    enable_speaker_diarization = True
    config = types.RecognitionConfig(
        encoding=enums.RecognitionConfig.AudioEncoding.FLAC,
        sample_rate_hertz=48000,
        language_code='pl-PL',
        audio_channel_count=1)
    operation = client.long_running_recognize(config, audio)
    # Blocks until this chunk is transcribed; the next chunk is submitted only afterwards.
    response = operation.result()
    # Append the best alternative of every result to the transcript file.
    with open('transkrypcja.txt', 'a') as data:
        for result in response.results:
            alternative = result.alternatives[0]
            data.write(alternative.transcript + '\n')
        data.write('\n\n\n\n\n')
    print('done')
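
The loop above calls operation.result() for one chunk before it submits the next, so the chunks are processed strictly one after another. A minimal sketch of running the blocking calls side by side with a thread pool follows. transcribe_chunk is a hypothetical stand-in that only sleeps; in the real script it would wrap client.long_running_recognize(...) and operation.result() for one chunk. Whether this shortens the total time in practice depends on the service and its quotas, which the question does not cover.

```python
from concurrent.futures import ThreadPoolExecutor
import time


def transcribe_chunk(index):
    """Hypothetical stand-in for one blocking recognition call.

    In the real script this would submit 'gs://MYBUCKET/sound-%s.flac' % index
    and return the text taken from operation.result().
    """
    time.sleep(0.05)  # simulate waiting on the remote service
    return 'transcript of chunk %s' % index


def transcribe_all(chunk_count, max_workers=4):
    """Run the blocking calls in worker threads and keep chunk order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in input order, not completion order,
        # so the transcripts can be written to the file in sequence.
        return list(pool.map(transcribe_chunk, range(chunk_count)))


if __name__ == '__main__':
    for text in transcribe_all(4):
        print(text)
```

Because pool.map preserves the input order, the transcripts can still be appended to transkrypcja.txt chunk by chunk once all calls have returned.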

Do you have a target speed in mind? Your numbers show about 0.47 minutes of processing per minute of audio (lower is better). Do you have a baseline speed that you are comparing GCP against? – Kolban, Nov 12 '19 at 03:26
@Kolban I have no idea how long it should take, which is why I'm asking. I initially thought it would take something like 1-2 minutes to process the whole audio, as it's only about 50 MB after conversion to FLAC. – adammo, Nov 12 '19 at 11:19
No problem, my friend. We'll see if we can compare the speed against the norm. Do note that your last comment implies that about 0.04 minutes per minute would be a good target. I myself have no idea what is good or bad here. Experience has taught me to ask, when I think performance is bad, what I would consider good and whether that performance is realistic. – Kolban, Nov 12 '19 at 14:20
Have you seen https://stackoverflow.com/questions/50364955/how-to-speed-up-google-cloud-speech? – Kolban, Nov 12 '19 at 15:31
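
The ratios quoted in the comments follow directly from the figures in the question: 25 minutes of processing for 53 minutes of audio, against the hoped-for 1-2 minutes for the whole file. A small sketch of the arithmetic; the function name is made up for illustration.

```python
def processing_ratio(processing_minutes, audio_minutes):
    """Minutes of processing per minute of audio (lower is better)."""
    return processing_minutes / audio_minutes


# Whole file as measured: 25 minutes for 53 minutes of audio.
print(round(processing_ratio(25, 53), 2))  # 0.47
# Hoped-for target: about 2 minutes for the whole file.
print(round(processing_ratio(2, 53), 2))   # 0.04
# A single chunk as measured: about 2 minutes for 5 minutes of audio.
print(round(processing_ratio(2, 5), 2))    # 0.4
```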
