Python: Eliminating gaps between segments of recorded audio

Question

I am using the Python sounddevice library to record audio, but I can't eliminate gaps of roughly 0.25 to 0.5 seconds between what should be consecutive audio files. I assumed the file writing was taking up the time, so I used multiprocessing and a Queue to move the writing into a separate process, but it hasn't helped. The most confusing part is that the logs show the iterations of the loop in main() are nearly gapless (only 1-5 milliseconds apart), yet the audio_capture function takes longer than expected even though nothing else significant is being done. I have reduced the script as much as possible for this post. My research all pointed to this threading/multiprocessing approach, so I am flummoxed.

Background: Python 3.7 on Raspbian Buster. I am dividing the data into segments so that the files are not too big; I imagine other programming tasks have to deal with this challenge too. I also have 4 other subprocesses doing various things afterwards.

Log: The audio_capture part should take only 10:00 (10 minutes) per segment.

08:26:29.991 --- Start of segment #0
08:36:30.627 --- End of segment #0     <<<<< This is >0.6 s later than it should be
08:36:30.629 --- Start of segment #1   <<<<< This is near gapless with the prior event

Script:

import logging
import sounddevice
from scipy.io.wavfile import write
import time
import os
from multiprocessing import Queue, Process

# this process is a near endless loop
def main():
    fileQueue = Queue()
    writerProcess = Process(target=writer, args=(fileQueue,))
    writerProcess.start()
    for i in range(9000):
        fileQueue.put(audio_capture(i)) 
    writerProcess.join()

# This func makes an audio data object from a sound source
def audio_capture(i): 
    cycleNumber = str(i)
    logging.debug('Start of segment #' + cycleNumber)
    # each cycle is 10 minutes at 32000Hz sample rate
    audio = sounddevice.rec(frames=600 * 32000, samplerate=32000, channels=2) 
    name = time.strftime("%H-%M-%S") + '.wav' 
    path = os.path.join('/audio', name)
    sounddevice.wait()
    logging.debug('End of segment #' + cycleNumber)
    return [audio, path]
    
# This function writes the files.
def writer(input_queue):
    while True:
        try:
            parameters = input_queue.get()
            audio = parameters[0]
            path = parameters[1]
            write(filename=path, rate=32000, data=audio)
            logging.debug('File is written')
        except:
            pass

if __name__ == "__main__":
    logging.basicConfig(level=logging.DEBUG, format='%(asctime)s.%(msecs)03d --- %(message)s', datefmt='%H:%M:%S',handlers=[logging.FileHandler('/audio/log.txt'), logging.StreamHandler()])
    main()

score 1 · Accepted Answer · answered Aug 20 '20 at 15:43

The documentation tells us that sounddevice.rec() is not meant for gapless recording:

If you need more control (e.g. block-wise gapless recording, overlapping recordings, …), you should explicitly create an InputStream yourself. If NumPy is not available, you can use a RawInputStream.

There are multiple examples for gapless recording in the example programs.
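As a rough illustration of that advice (not the code from the example programs), the sketch below keeps one `sounddevice.RawInputStream` open the whole time and lets the main thread cut the incoming blocks into fixed-length WAV files. The names `write_segments`, `record_segments`, and `make_path`, the `int16` sample format, and the use of the standard-library `wave` module in place of `scipy.io.wavfile.write` are all assumptions made for this example.

```python
import os
import queue
import time
import wave

SAMPLERATE = 32000
CHANNELS = 2
SAMPLE_WIDTH = 2  # bytes per sample, assuming dtype='int16'


def write_segments(blocks, frames_per_segment, make_path):
    """Write raw audio blocks into consecutive WAV files of fixed length.

    A block that straddles a segment boundary is split, so the next file
    starts with the very next sample. Returns the list of paths opened.
    """
    segment_bytes = frames_per_segment * CHANNELS * SAMPLE_WIDTH
    paths = []
    wav = None
    remaining = 0
    try:
        for block in blocks:
            offset = 0
            while offset < len(block):
                if wav is None:
                    # Start a new segment file.
                    path = make_path(len(paths))
                    wav = wave.open(path, 'wb')
                    wav.setnchannels(CHANNELS)
                    wav.setsampwidth(SAMPLE_WIDTH)
                    wav.setframerate(SAMPLERATE)
                    paths.append(path)
                    remaining = segment_bytes
                chunk = block[offset:offset + remaining]
                wav.writeframesraw(chunk)
                offset += len(chunk)
                remaining -= len(chunk)
                if remaining == 0:
                    wav.close()  # close() fixes up the WAV header
                    wav = None
    finally:
        if wav is not None:
            wav.close()
    return paths


def record_segments(frames_per_segment=600 * SAMPLERATE):
    """Record until interrupted, one WAV file per 10 minutes by default."""
    # Imported here so write_segments() can be tried without audio hardware.
    import sounddevice as sd

    q = queue.Queue()

    def callback(indata, frames, time_info, status):
        # Runs in the audio thread: only copy the data and hand it over.
        if status:
            print(status)
        q.put(bytes(indata))

    def make_path(index):
        # Hypothetical naming scheme, mirroring the question's script.
        return os.path.join('/audio', time.strftime('%H-%M-%S') + '.wav')

    with sd.RawInputStream(samplerate=SAMPLERATE, channels=CHANNELS,
                           dtype='int16', callback=callback):
        # iter(q.get, None) yields blocks forever; stop with Ctrl+C.
        write_segments(iter(q.get, None), frames_per_segment, make_path)
```

Because the stream is never stopped and restarted between files, a slow disk write only makes the queue grow for a moment; it should not introduce a gap between the end of one file and the start of the next.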

Matthias

Thank you, Matthias! I am looking at the examples and learning about asyncio and callbacks to understand what they are doing. Are you referring to the examples called `rec_unlimited.py` and `rec_gui.py`? The others appear to be for playback or unusual functions, and I wanted to check my initial understanding since the readme didn't say much and there are a lot of programming structures and syntax I don't understand yet. – rfii Aug 25 '20 at 20:24
Yes, `rec_unlimited.py` should be a good starting point. Instead of running your `audio_capture()` function in the main thread, you should run the code (probably with some adaptations) in the audio callback function. Then you can do the writing in the main thread. You most likely don't need the `multiprocessing` module. In more complicated situations you might want to use the `threading` module to create a separate writing thread (as shown in the `rec_gui.py` example). However, you should try it first without creating additional threads. – Matthias Aug 26 '20 at 09:20

score 1 · Answer 2 · answered Aug 22 '20 at 19:07

Use PyAudio and open a non-blocking audio stream. You can find a very good basic example on the front page of the PyAudio documentation. Choose a buffer size; I recommend 512 or 1024. Now just append the incoming data to a NumPy array. I sometimes store up to 30 seconds of audio in one NumPy array. When you reach the end of a segment, create another empty NumPy array and start over. Create a thread and save the first segment somewhere. The recording will continue and not one sample will be dropped ;)

Edit: if you want to write 10 minutes to one file, I would suggest creating 10 arrays of 1 minute each, then appending them and saving the result.
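A minimal sketch of this idea, under some assumptions: the helper names `SegmentBuffer`, `save_wav`, and `start_recording` are made up for illustration, the samples are 16-bit, and a plain list of byte blocks stands in for the NumPy array so that the example needs only the standard library plus PyAudio. The callback only appends data; each full segment is handed to a short-lived thread that joins the blocks and writes the file.

```python
import threading
import wave

RATE = 32000
CHANNELS = 2
SAMPLE_WIDTH = 2  # bytes per sample, assuming pyaudio.paInt16


def save_wav(path, blocks):
    """Join the collected blocks and write them as one WAV file."""
    with wave.open(path, 'wb') as wav:
        wav.setnchannels(CHANNELS)
        wav.setsampwidth(SAMPLE_WIDTH)
        wav.setframerate(RATE)
        wav.writeframes(b''.join(blocks))


class SegmentBuffer:
    """Collects incoming blocks; saves each full segment in its own thread."""

    def __init__(self, segment_bytes, make_path):
        self.segment_bytes = segment_bytes
        self.make_path = make_path  # hypothetical: maps segment index to path
        self.blocks = []
        self.filled = 0
        self.count = 0
        self.writers = []

    def add(self, block):
        while block:
            room = self.segment_bytes - self.filled
            part = block[:room]
            self.blocks.append(part)
            self.filled += len(part)
            block = block[room:]  # leftover goes into the next segment
            if self.filled == self.segment_bytes:
                self._flush()

    def _flush(self):
        # Swap in a fresh list so recording continues while the file is saved.
        full, self.blocks = self.blocks, []
        self.filled = 0
        writer = threading.Thread(
            target=save_wav, args=(self.make_path(self.count), full))
        writer.start()
        self.writers.append(writer)
        self.count += 1


def start_recording(buffer, frames_per_buffer=1024):
    """Open a non-blocking PyAudio input stream that feeds `buffer`."""
    import pyaudio  # imported here so SegmentBuffer works without PyAudio

    def callback(in_data, frame_count, time_info, status):
        buffer.add(in_data)
        return (None, pyaudio.paContinue)

    pa = pyaudio.PyAudio()
    stream = pa.open(format=pyaudio.paInt16, channels=CHANNELS, rate=RATE,
                     input=True, frames_per_buffer=frames_per_buffer,
                     stream_callback=callback)
    return pa, stream
```

For 10-minute files at these settings, `segment_bytes` would be `600 * RATE * CHANNELS * SAMPLE_WIDTH`. Keeping the join and the disk write out of the callback is the point of the design: the audio thread does only cheap list appends.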

Arjaan Auinger

FYI, non-blocking audio streams can also be created with the `sounddevice` module. Just use `sounddevice.InputStream` or `sounddevice.RawInputStream` as mentioned in my answer. – Matthias Aug 25 '20 at 19:45
Thank you! I will try this. Conceptually, it seems similar to what I've done with sounddevice except with pyaudio. Is that correct or am I misunderstanding something? With sounddevice, I have a thread that does audio capture and a thread that does file writing. In other words, I am trying to learn why your solution works and mine does not so that I can avoid it in the future. Thank you! – rfii Aug 25 '20 at 19:47
PyAudio and the `sounddevice` module are the same in this respect (because they are both wrapping the PortAudio library). Both let you define a callback function which will automatically be called from a separate audio thread (you don't have to create a Python thread for that). The `sounddevice` module has additional convenience functions like `sounddevice.rec()`, but you shouldn't use them in this case. – Matthias Aug 26 '20 at 09:11