Real-time audio signal processing using Python

Question

I have been trying to do real-time audio signal processing using the PyAudio module in Python. As a simple test, I read audio data from the microphone and play it back through headphones. I tried the following code (both Python and Cython versions). It works, but the audio stalls and is not smooth. How can I improve the code so that it runs smoothly? My PC has an i7 CPU and 8 GB of RAM.

Python Version

import pyaudio
import numpy as np

RATE    = 16000
CHUNK   = 256
    
p               =   pyaudio.PyAudio()

player = p.open(format=pyaudio.paInt16, channels=1, rate=RATE, output=True, 
frames_per_buffer=CHUNK)
stream = p.open(format=pyaudio.paInt16, channels=1, rate=RATE, input=True, frames_per_buffer=CHUNK)

for i in range(int(20*RATE/CHUNK)): # do this for 20 seconds
    player.write(np.fromstring(stream.read(CHUNK),dtype=np.int16))
stream.stop_stream()
stream.close()
p.terminate()

Cython Version

import pyaudio
import numpy as np

cdef int RATE   = 16000
cdef int CHUNK  = 1024
cdef int i      
p               =   pyaudio.PyAudio()

player = p.open(format=pyaudio.paInt16, channels=1, rate=RATE, output=True, frames_per_buffer=CHUNK)
stream = p.open(format=pyaudio.paInt16, channels=1, rate=RATE, input=True, frames_per_buffer=CHUNK)

for i in range(500): # 500 chunks, about 32 seconds at this RATE and CHUNK
    player.write(np.fromstring(stream.read(CHUNK),dtype=np.int16))
stream.stop_stream()
stream.close()
p.terminate()

Don't know what you mean by "stalls" and what you expect. There is nothing to be gained by using Cython: there are no Python calculations, everything is done by C code inside the libraries. You call it real-time but use blocking I/O, so how should it work? Use the non-blocking version: https://people.csail.mit.edu/hubert/pyaudio/docs/#example-callback-mode-audio-i-o – ead, Sep 24 '17 at 05:27
By "stalls", I meant that the audio breaks up in between. How do blocking and non-blocking modes differ? Thank you for the link. – Sajil C K, Sep 24 '17 at 06:32
In your case "blocking" means that when it plays it does not record, and when it records it does not play. – ead, Sep 24 '17 at 10:52
@ead, while non-blocking mode can be used to wire the input (microphone) to the output (headset/speaker) directly, you can't do any processing on the audio, as you have no access to or control over it. For any mid-stream processing, the OP will need to use the blocking version (which he is using). – Anil_M, Sep 27 '17 at 16:07
Is there a reason you didn't multithread the application? All the Python real-time audio processing examples I've seen have used multithreading. – Tyler Hilbert, Nov 20 '17 at 22:17

score 10 · Accepted Answer · Anil_M · 2017-09-27T17:11:53.187

I believe you are missing CHUNK as the second argument to the player.write call.

player.write(np.frombuffer(stream.read(CHUNK),dtype=np.int16),CHUNK)
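
A possible reason the explicit CHUNK matters, sketched under an assumption about PyAudio's internals: when the frame count is omitted, write() appears to derive it from len() of the data divided by the bytes per frame. For raw bytes that gives the right answer, but len() of a NumPy int16 array counts samples rather than bytes, so the derived count would be half the real one. The helper inferred_frames below is hypothetical and only mimics that arithmetic; it is not a PyAudio function.

```python
import numpy as np

CHUNK = 256          # frames per buffer, as in the question
CHANNELS = 1
SAMPLE_WIDTH = 2     # bytes per sample for 16-bit audio (paInt16)

def inferred_frames(data, channels=CHANNELS, width=SAMPLE_WIDTH):
    """Hypothetical helper: frame count guessed from len(data) alone."""
    return len(data) // (channels * width)

# Stand-in for stream.read(CHUNK): CHUNK frames of silence as raw bytes.
raw = bytes(CHUNK * CHANNELS * SAMPLE_WIDTH)
samples = np.frombuffer(raw, dtype=np.int16)

frames_from_bytes = inferred_frames(raw)      # len() counts bytes
frames_from_array = inferred_frames(samples)  # len() counts samples

print(frames_from_bytes, frames_from_array)
```

Passing CHUNK explicitly, or writing the raw bytes (for example samples.tobytes()), sidesteps the ambiguity.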

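Also, this may just be a formatting error, but player.write needs to be indented inside the for loop.

The PyAudio examples compute the loop count as RATE / CHUNK * RECORD_SECONDS. Since * and / have the same precedence in Python and are evaluated left to right, RECORD_SECONDS * RATE / CHUNK gives the same count under true division, so either form works here.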

for i in range(int(20*RATE/CHUNK)): # do this for 20 seconds
    player.write(np.frombuffer(stream.read(CHUNK),dtype=np.int16),CHUNK)

stream.stop_stream()
stream.close()
p.terminate()
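
As a quick check of the loop count: in Python, * and / share the same precedence and are evaluated left to right, so with true division the two orderings agree for these values. Only floor division (or Python 2 integer division) makes the order matter, because the fractional part of RATE / CHUNK is then discarded before multiplying. RECORD_SECONDS is the name used in the PyAudio examples; the values are those from the question.

```python
RATE = 16000
CHUNK = 256
RECORD_SECONDS = 20

# True division: both orderings give the same number of chunks.
seconds_first = int(RECORD_SECONDS * RATE / CHUNK)
rate_first = int(RATE / CHUNK * RECORD_SECONDS)

# Floor division truncates 62.5 to 62 before multiplying.
floor_rate_first = RATE // CHUNK * RECORD_SECONDS

print(seconds_first, rate_first, floor_rate_first)
```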

Finally, you may want to increase RATE to 44100, CHUNK to 1024, and the number of channels to 2 for better fidelity.

edited Sep 27 '17 at 17:11

answered Sep 27 '17 at 16:39


Thanks for the help. I tried your suggestions. After passing CHUNK as an argument to player.write, it was much better, but I still get the error 'ALSA lib pcm.c:7843:(snd_pcm_recover) underrun occurred'. I did a bit of research and found that it is related to the buffer size, so I experimented with the CHUNK size and found that 128 is a good value. I still get the same error, but only for the first few seconds; after that it works perfectly fine. – Sajil C K Sep 28 '17 at 08:46
I tried to up-vote your answer but I don't have enough reputation to do so. – Sajil C K Sep 28 '17 at 11:11
It is customary on SO to accept an answer if it helped with your question. It helps close the loop, and it also adds points to your account. See here for how to accept an answer: https://meta.stackexchange.com/questions/5234/how-does-accepting-an-answer-work – Anil_M Sep 28 '17 at 12:52
Don’t need to. If you look at SO stats, many questions remain unresolved even though valid answers were provided. This creates “noise” and skews the stats. I was also guiding a newcomer. Hope that helps. – Anil_M May 04 '18 at 18:13
A lot of online examples, including other questions, appear to be missing the argument `CHUNK` from `write()` - adding this solved my issue. I was getting weird / buzzy sound. Not same issue as question OP – FreelanceConsultant Jan 05 '20 at 20:50

score 8 · Answer 2 · answered May 27 '18 at 18:11

The code below will take the default input device, and output what's recorded into the default output device.

import time

import numpy as np
import pyaudio

p = pyaudio.PyAudio()

CHANNELS = 2
RATE = 44100

def callback(in_data, frame_count, time_info, flag):
    # Convert the raw bytes to a NumPy array here if you want to process them:
    # audio_data = np.frombuffer(in_data, dtype=np.float32)
    return in_data, pyaudio.paContinue

stream = p.open(format=pyaudio.paFloat32,
                channels=CHANNELS,
                rate=RATE,
                output=True,
                input=True,
                stream_callback=callback)

stream.start_stream()

# The callback runs in the background; let it play for 20 seconds, then stop.
if stream.is_active():
    time.sleep(20)
    stream.stop_stream()
    print("Stream is stopped")

stream.close()

p.terminate()

This will run for 20 seconds and then stop. The callback function is where you can process the signal, for example after converting it to an array: audio_data = np.frombuffer(in_data, dtype=np.float32)

The first value returned by the callback (in_data here) is what gets sent to the output device, so return your post-processed data there.

Note that the chunk size (the frames_per_buffer argument to open) has a default of 1024, as noted in the PyAudio docs: http://people.csail.mit.edu/hubert/pyaudio/docs/#pyaudio.PyAudio.open
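
To make the processing step concrete, here is a minimal sketch of a callback that halves the volume. It assumes the stream was opened with paFloat32 as above; the GAIN value and the process helper are illustrative names, not part of PyAudio, and PA_CONTINUE stands in for pyaudio.paContinue (assumed to be 0) so the snippet runs without an audio device.

```python
import numpy as np

GAIN = 0.5         # illustrative: halve the volume
PA_CONTINUE = 0    # stand-in for pyaudio.paContinue (assumed value)

def process(in_data, gain=GAIN):
    """Scale float32 samples held in a bytes buffer and return bytes."""
    samples = np.frombuffer(in_data, dtype=np.float32)
    # Multiplying creates a new array, so the read-only input is untouched.
    return (samples * gain).astype(np.float32).tobytes()

def callback(in_data, frame_count, time_info, flag):
    # Return the processed bytes plus a flag telling PyAudio to keep going.
    return process(in_data), PA_CONTINUE

# Offline check with two stereo frames instead of live microphone data.
fake_input = np.array([1.0, -1.0, 0.5, 0.25], dtype=np.float32).tobytes()
out_bytes, status = callback(fake_input, 2, None, 0)
out = np.frombuffer(out_bytes, dtype=np.float32)
print(out, status)
```

In the real program, return pyaudio.paContinue instead of the stand-in, and keep the work inside the callback short, since PyAudio calls it in a separate thread for every buffer.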

score 5 · Answer 3 · answered Jun 29 '18 at 13:02

I am working on a similar project. I modified your code and the stalls are now gone. The bigger the chunk, the bigger the delay, which is why I kept it small.

import pyaudio
import numpy as np

CHUNK = 2**5
RATE = 44100
LEN = 10

p = pyaudio.PyAudio()

stream = p.open(format=pyaudio.paInt16, channels=1, rate=RATE, input=True, frames_per_buffer=CHUNK)
player = p.open(format=pyaudio.paInt16, channels=1, rate=RATE, output=True, frames_per_buffer=CHUNK)


for i in range(int(LEN*RATE/CHUNK)): # run for LEN seconds
    data = np.frombuffer(stream.read(CHUNK),dtype=np.int16)
    player.write(data,CHUNK)


stream.stop_stream()
stream.close()
player.stop_stream()
player.close()
p.terminate()
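
The chunk-size trade-off can be estimated with simple arithmetic: one buffer of CHUNK frames at RATE frames per second holds CHUNK / RATE seconds of audio, which is a lower bound on the delay added per buffer. The helper name buffer_ms is made up for this sketch, and actual latency will be higher because drivers and hardware add their own buffering.

```python
def buffer_ms(chunk, rate):
    """Duration of one buffer of `chunk` frames at `rate` Hz, in milliseconds."""
    return 1000.0 * chunk / rate

# Chunk sizes and rates that appear in this thread.
for chunk, rate in [(2**5, 44100), (256, 16000), (1024, 44100)]:
    print(f"CHUNK={chunk:5d} RATE={rate}: {buffer_ms(chunk, rate):.2f} ms per buffer")
```

Very small chunks reduce the delay but leave less time to handle each buffer, which can lead to underruns like the ALSA message quoted earlier in the thread.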
