Issue with Amazon Transcribe

Question

I have a few recordings (audio files from Amazon Connect, in .wav format) stored in an S3 bucket.
I followed this tutorial (https://aws.amazon.com/getting-started/tutorials/create-audio-transcript-transcribe/) to convert the audio files to transcripts using Amazon Transcribe.

The files were converted to transcripts successfully.

NOTE: The voices in the audio files are very clear; the only thing is that the speakers have an Indian accent (not a UK/US accent).

Surprisingly, it is not able to detect many words correctly. Most of the words were wrong.

1) Is there a setting I need to configure for detecting the Indian accent?
2) Has anyone tested with a US/UK accent and gotten at least 80 percent of the words detected correctly?
3) Can anyone suggest another tool they prefer for converting audio to text?

Thanks,
Harry

Juned Ahsan · Accepted Answer · 2019-10-25T05:34:44.313

1

You can try transcribing with the Indian English language code:

Indian English (en-IN)

Hopefully it does not start misbehaving on the parts spoken in other accents such as UK/US, but it is still worth a try. Otherwise you may need to think about splitting the audio into separate parts, which will obviously be cumbersome.
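For anyone wondering where the language code goes: it is the `LanguageCode` field of the StartTranscriptionJob request. Below is a minimal sketch using boto3, assuming that is the SDK in use. The job name, bucket, file name and region are hypothetical placeholders, not values from the question, so substitute your own.

```python
def build_job_request(job_name, media_uri, language_code="en-IN", media_format="wav"):
    """Build the keyword arguments for Transcribe's StartTranscriptionJob.

    language_code defaults to Indian English (en-IN), as suggested above.
    """
    return {
        "TranscriptionJobName": job_name,
        "LanguageCode": language_code,
        "MediaFormat": media_format,
        "Media": {"MediaFileUri": media_uri},
    }


def start_job(job_name, media_uri, region_name="us-east-1"):
    """Submit the job. Needs boto3 and AWS credentials; region is an assumption."""
    import boto3  # imported lazily so build_job_request works without the SDK

    client = boto3.client("transcribe", region_name=region_name)
    return client.start_transcription_job(**build_job_request(job_name, media_uri))


if __name__ == "__main__":
    # Hypothetical job name and S3 URI, for illustration only.
    request = build_job_request(
        "connect-call-001", "s3://my-recordings-bucket/call-001.wav"
    )
    print(request["LanguageCode"])
```

Calling `start_job("connect-call-001", "s3://my-recordings-bucket/call-001.wav")` would submit the job with the same request. The builder is kept separate from the network call only so the request can be inspected without touching AWS.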

edited Oct 25 '19 at 05:34

answered Oct 25 '19 at 00:40

Juned Ahsan


I checked this already, but where do I need to specify this code? Do you mean in the Amazon Transcribe API? If so, could you point me to an example? Appreciated. – Harry Oct 25 '19 at 00:45
@Harry It depends on which SDK you are using. Here is the API reference: https://docs.aws.amazon.com/transcribe/latest/dg/API_StartTranscriptionJob.html#transcribe-StartTranscriptionJob-request-LanguageCode – Juned Ahsan Oct 25 '19 at 01:01
