
I have one audio file in .wav format, but when I run it through a speech-to-text model, it does not give me the full text. The audio is in English with different slang.

Please help me work out how I can get the full text. My audio file is only 0.15 seconds long.

```
import os
import speech_recognition as sr

r = sr.Recognizer()
# raw string so the backslashes in the Windows path are not read as escapes
folder = r'.\Speaker_diarization_Reporting_matrix\cluster-chunks'
os.chdir(folder)
filename = "1_speaker1.wav"
with sr.AudioFile(filename) as source:
    # listen for the data (load audio to memory)
    audio_data = r.record(source)
    # recognize (convert from speech to text)
    text = r.recognize_google(audio_data)
    print(text)
```
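Since the length of the chunk is in question, it may help to read the duration straight from the file header before blaming the recognizer. The following is only a sketch using the standard-library `wave` module; `wav_info` and `write_silence` are hypothetical helper names, and the demo writes a synthetic silent file rather than touching `1_speaker1.wav`.

```python
import os
import tempfile
import wave


def wav_info(path):
    """Return (duration_seconds, sample_rate, channels) for a PCM WAV file."""
    with wave.open(path, "rb") as wf:
        frames = wf.getnframes()
        rate = wf.getframerate()
        return frames / float(rate), rate, wf.getnchannels()


def write_silence(path, seconds, rate=16000):
    """Write a mono 16-bit WAV of silence (used here only as demo input)."""
    with wave.open(path, "wb") as wf:
        wf.setnchannels(1)
        wf.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        wf.setframerate(rate)
        wf.writeframes(b"\x00\x00" * int(seconds * rate))


if __name__ == "__main__":
    demo_path = os.path.join(tempfile.mkdtemp(), "demo.wav")
    write_silence(demo_path, 1.0)
    duration, rate, channels = wav_info(demo_path)
    print(f"{duration:.2f} s, {rate} Hz, {channels} channel(s)")
```

Calling `wav_info("1_speaker1.wav")` from the chunk folder would show whether the file is a fraction of a second or several seconds long, which changes what output can reasonably be expected.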
user
  • If your model cannot process slang, it is because it has not been trained on slang sentences... Not sure why you said that your audio file is only 0.15 s, can you explain that? – Giuppox Jan 22 '21 at 07:19
  • Sorry, my audio is 0.08 seconds long. What is the best approach to read the text from the .wav file (automatically detecting the language)? I need a pretrained model. – user Jan 22 '21 at 07:26
  • I don't understand... If an entire sentence is super-concentrated in a very short audio, it is normal that the model does not understand a few words; indeed, I would be surprised if it understood even one. – Giuppox Jan 22 '21 at 08:21
  • The complete audio is 2 min long, but I have trimmed the audio file into chunks. Can you please help me troubleshoot the issue? Can I share the audio file with you if you share your email ID? – user Jan 22 '21 at 12:55
  • It is written in my profile's bio. – Giuppox Jan 22 '21 at 15:06
  • Sent an email to you. – user Jan 25 '21 at 12:26
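Because the 2-minute recording was trimmed into chunks, one thing worth ruling out is that a hard cut splits a word across two files, so neither chunk contains it whole. A rough sketch, assuming plain PCM WAV input and using only the standard-library `wave` module, splits the audio with a small overlap between neighbouring chunks. `split_wav`, the chunk length, and the overlap are hypothetical choices, not something taken from the question.

```python
import os
import tempfile
import wave


def split_wav(path, out_dir, chunk_s=10.0, overlap_s=1.0):
    """Split a PCM WAV into chunk_s-second files that overlap by overlap_s."""
    if overlap_s >= chunk_s:
        raise ValueError("overlap_s must be smaller than chunk_s")
    out_paths = []
    with wave.open(path, "rb") as src:
        params = src.getparams()
        rate = src.getframerate()
        total = src.getnframes()
        chunk_frames = int(chunk_s * rate)
        # advance by less than a full chunk so neighbours share some audio
        step_frames = max(1, int((chunk_s - overlap_s) * rate))
        start = 0
        while start < total:
            src.setpos(start)
            frames = src.readframes(chunk_frames)
            out_path = os.path.join(out_dir, f"chunk_{len(out_paths):03d}.wav")
            with wave.open(out_path, "wb") as dst:
                dst.setparams(params)  # frame count in header is fixed on close
                dst.writeframes(frames)
            out_paths.append(out_path)
            if start + chunk_frames >= total:
                break  # this chunk already reached the end of the file
            start += step_frames
    return out_paths


if __name__ == "__main__":
    work_dir = tempfile.mkdtemp()
    source_path = os.path.join(work_dir, "source.wav")
    with wave.open(source_path, "wb") as wf:  # 3 s of silence as demo input
        wf.setnchannels(1)
        wf.setsampwidth(2)
        wf.setframerate(16000)
        wf.writeframes(b"\x00\x00" * 48000)
    for chunk in split_wav(source_path, work_dir, chunk_s=1.0, overlap_s=0.25):
        print(chunk)
```

Each resulting file could then be passed through `sr.AudioFile` and `r.recognize_google` as in the question; the overlapping words would appear twice in the combined transcript and would need to be de-duplicated.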

0 Answers