1

I have a few audio files, each a conversation between a customer and an agent, stored in S3. I am converting them to text with AWS Transcribe, and the jobs complete successfully.

The weird part is that the transcripts are not even 60% accurate. This is my configuration for AWS Transcribe:

1) Language code - English (Indian)
2) Audio frequency - 8000 Hz
3) Format - WAV

As per these guidelines (https://docs.aws.amazon.com/transcribe/latest/dg/limits-guidelines.html),
I set the audio frequency to 8 kHz and the format to WAV. Do I need to change any other parameters to improve the transcription accuracy?
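
For reference, the settings listed above map onto a boto3 request roughly as follows. This is a minimal sketch, not the exact code I run: the job name and S3 URI are placeholders, the helper names `build_transcription_request` and `submit_job` are made up for illustration, and turning on speaker labels for a two-person call is an assumption about what might help rather than part of my current setup.

```python
def build_transcription_request(job_name, media_uri, max_speakers=2):
    """Build keyword arguments for boto3's start_transcription_job.

    job_name and media_uri are placeholders supplied by the caller.
    """
    return {
        "TranscriptionJobName": job_name,
        "LanguageCode": "en-IN",          # English (Indian)
        "MediaFormat": "wav",
        "MediaSampleRateHertz": 8000,     # must match the file's real sample rate
        "Media": {"MediaFileUri": media_uri},
        # Assumption: speaker labels may help separate customer and agent.
        "Settings": {
            "ShowSpeakerLabels": True,
            "MaxSpeakerLabels": max_speakers,
        },
    }


def submit_job(request):
    """Send the request to Amazon Transcribe (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the builder above works without boto3

    client = boto3.client("transcribe")
    return client.start_transcription_job(**request)
```

Calling `submit_job(build_transcription_request("call-001", "s3://example-bucket/call-001.wav"))` would start the job; both arguments here are hypothetical values.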

Any help is appreciated.

Thanks,
Harry

rchard2scout
Harry

2 Answers

0

Many things can affect transcript quality, such as background noise in the audio, speaker overlap, and the speakers' accents. Higher-quality audio usually gives better results.
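
Before tuning anything on the Transcribe side, it may be worth checking what the WAV files actually contain, since the sample rate declared in the job should match the file. A minimal sketch using Python's standard `wave` module; the helper name `inspect_wav` is hypothetical, and it only handles uncompressed PCM WAV files, which is all the `wave` module reads.

```python
import wave


def inspect_wav(path):
    """Return basic properties of a PCM WAV file."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        return {
            "sample_rate_hz": rate,
            "channels": wav.getnchannels(),
            "bits_per_sample": wav.getsampwidth() * 8,
            "duration_s": wav.getnframes() / rate,
        }
```

If `sample_rate_hz` is not 8000, the configured audio frequency does not describe the file. If `channels` is 2 with the customer and agent on separate channels, channel identification may be an option worth trying instead of a mono mix.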

Ruoyu Huang
0

You can try using custom vocabularies. You can create them as described here: https://docs.aws.amazon.com/transcribe/latest/dg/how-vocabulary.html

The custom vocabulary should contain keywords that you expect to be spoken and that are specific to your domain. However, in my experience custom vocabularies sometimes overfit: the transcript incorrectly contains words from the custom vocabulary.
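
A rough sketch of creating such a vocabulary with boto3. The vocabulary name and terms are made-up examples, the helper names are hypothetical, and joining multi-word terms with hyphens reflects my understanding of the list format for `Phrases`, so check it against the linked documentation.

```python
def to_vocabulary_phrases(terms):
    """Normalise terms: hyphenate multi-word terms, drop blanks and duplicates."""
    phrases = []
    for term in terms:
        phrase = "-".join(term.split())
        if phrase and phrase not in phrases:
            phrases.append(phrase)
    return phrases


def build_vocabulary_request(vocabulary_name, terms, language_code="en-IN"):
    """Build keyword arguments for boto3's create_vocabulary."""
    return {
        "VocabularyName": vocabulary_name,
        "LanguageCode": language_code,  # must match the transcription job's language
        "Phrases": to_vocabulary_phrases(terms),
    }


def create_vocabulary(request):
    """Create the vocabulary (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the helpers above work without boto3

    client = boto3.client("transcribe")
    return client.create_vocabulary(**request)
```

Once the vocabulary's state is `READY`, pass its name in the job's `Settings` as `VocabularyName`. Given the overfitting mentioned above, starting with a short list of terms and comparing transcripts with and without it seems safer than adding everything at once.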

ajay0221