I am transcribing an audio file using Whisper's small model. The transcription itself works fine, but the Whisper model seems to auto-correct grammar.

Here is the code I am using:

import whisper

fname = 'audioFile'  # path to the audio file
model = whisper.load_model('small')

# transcribe() returns a dict; the transcript is under the "text" key
result = model.transcribe(fname)
print(result["text"])

The problem is that it corrects grammar mistakes while transcribing the audio file.

My audio file contains:

"In this course, will we...."

But it returns:

"In this course. we will..."

Here is a link to the audio file:

'https://drive.google.com/file/d/1lNjjKvZxJ8G_pH7i19oMhxLf2Oj5loiE/view?usp=share_link'

I also googled this and found one similar question here, but it is unanswered.

Any help or suggestions would be appreciated. Thank you!

mann
  • You're saying it auto-corrects, which suggests that you think it recognises the audio correctly and then corrects the grammar. However, have you considered that the model decodes the audio as if no grammar error had been made? Sound-to-text models will always be biased to hear text without errors, since that's what is to be expected. Does it *always* correct grammar errors, especially when they can't be 'misheard' this way? – Grismar May 16 '23 at 06:39
  • No, not always, but sometimes it corrects the grammar. How can I avoid it? – mann May 16 '23 at 06:53
  • If the problem is what I suggested, there's no way to avoid it, unless you can get the model to be more allowing of people making mistakes in the audio. It's like when you're listening to someone in a loud bar - even though they might make a mistake and say 'will we ...', you're more likely to make out 'we will ...' because that's what would make more sense. However, the model may have some parameters that affect the precision, someone with deeper knowledge of Whisper may know. – Grismar May 16 '23 at 06:57
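
On the parameters mentioned in the last comment: below is a minimal sketch of decoding options that might make the output more literal. `temperature`, `condition_on_previous_text`, and `initial_prompt` are existing keyword arguments of `model.transcribe()` in the openai-whisper package. The helper names `literal_decoding_options` and `transcribe_literal` and the example prompt text are hypothetical, and it is only an assumption that these settings keep "will we" from being decoded as "we will". The result has to be checked against the audio.

```python
def literal_decoding_options(prompt=None):
    """Build keyword arguments for model.transcribe().

    Hypothetical helper: the keys are real transcribe() options, but their
    effect on grammar "correction" is an assumption, not a guarantee.
    """
    options = {
        # A single temperature disables the fallback to higher temperatures.
        "temperature": 0.0,
        # Do not feed previously decoded text back in as context.
        "condition_on_previous_text": False,
    }
    if prompt is not None:
        # A prompt written in a disfluent style may nudge the decoder
        # toward keeping the speaker's mistakes.
        options["initial_prompt"] = prompt
    return options


def transcribe_literal(path, model_name="small"):
    """Transcribe `path` with the options above (requires openai-whisper)."""
    import whisper  # imported here so the helper above works without it

    model = whisper.load_model(model_name)
    # Example prompt text is made up for illustration.
    options = literal_decoding_options(
        prompt="Umm, so, like, we was going to, uh, start now."
    )
    result = model.transcribe(path, **options)
    return result["text"]
```

Calling `transcribe_literal('audioFile')` and comparing its output with the plain `model.transcribe(fname)` result would show whether the options change anything for this recording.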

0 Answers