I'm trying to break an audio file into small subsections and then run speech recognition on each one. I'm splitting the file with pydub and want to pass the pieces to the SpeechRecognition library, but without saving each chunk to disk and re-reading it. So I need an in-memory conversion from a pydub.AudioSegment object to a speech_recognition.AudioData object.

Is there any way to do this?

(I'm looking for a similar end result to this question)

I've already got the original audio split into segments, and stored in a list.

1 Answer

Instead of pydub, try silero-vad. It should cover the requirement you describe.

  • This does not provide an answer to the question. To critique or request clarification from an author, leave a comment below their post. - [From Review](/review/late-answers/34910823) – Ram Chander Aug 31 '23 at 13:08