Spleeter: Real-time Audio Vocal Removal Issue with 'spleeter' and PyAudio

Question

I have been using the spleeter library in Python to separate vocals from an audio file, and it works perfectly when processing a pre-recorded audio file. However, I am trying to implement real-time audio vocal removal using spleeter with PyAudio, but it doesn't seem to be working as expected. I have written the following code, but it's not producing the desired output. I need help from an expert to troubleshoot and resolve the issue.

from spleeter.separator import Separator
import multiprocessing
import pyaudio
import numpy as np

# Global variables
CHUNK_SIZE = 1024
SAMPLING_RATE = 16000
THIN_FACTOR = 0.5
vocals_data = bytes()

# Create PyAudio object
p = pyaudio.PyAudio()

# Define callback function for audio processing
def process_audio(in_data, frame_count, time_info, status):
    global vocals_data
    # Convert input data to numpy array
    audio_array = np.frombuffer(in_data, dtype=np.int16)
    
    # Perform vocal removal on the audio input
    # Pass the audio array as waveform to separate() method
    vocals = Separator('spleeter:2stems').separate(audio_array)
    
    # Convert vocals to audio data
    vocals_data = vocals['vocals'].flatten().astype(np.int16).tobytes()

    # Return processed data for output
    return vocals_data, pyaudio.paContinue

# Open stream for recording
stream = p.open(format=pyaudio.paInt16,
                channels=1,
                rate=SAMPLING_RATE,
                input=True,
                output=True,  # Set output to True for an output stream
                frames_per_buffer=CHUNK_SIZE,
                stream_callback=process_audio)

# Start stream
stream.start_stream()

# Create stream for playback
playback_stream = p.open(format=pyaudio.paInt16,
                         channels=1,
                         rate=SAMPLING_RATE,
                         output=True)

# Play processed data in real-time
while stream.is_active():
    if len(vocals_data) >= CHUNK_SIZE:
        playback_stream.write(vocals_data[:CHUNK_SIZE])
        vocals_data = vocals_data[CHUNK_SIZE:]

# Stop streams
stream.stop_stream()
stream.close()
playback_stream.stop_stream()
playback_stream.close()

# Terminate PyAudio object
p.terminate()

if __name__ == '__main__':
    multiprocessing.freeze_support()

I tried implementing real-time audio vocal removal using spleeter and PyAudio. I expected the code to separate the vocals from the audio input in real-time and play the processed audio without any issues.

I have also tried using multiprocessing.freeze_support() but it didn't resolve the issue. Any help or suggestions would be greatly appreciated!
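One detail in the playback loop above may be worth checking. `frames_per_buffer` counts frames, but `vocals_data` is a `bytes` object, so `vocals_data[:CHUNK_SIZE]` slices 1024 bytes, which is only 512 samples of mono `paInt16` audio. PyAudio also invokes the callback on a separate thread, and the callback reassigns `vocals_data` on every call, so anything the main loop has not yet played is overwritten. A minimal sketch of a frame-aware, thread-safe buffer that could sit between the two is shown below. `FrameBuffer` and the constant names are hypothetical helpers for illustration, not part of PyAudio or Spleeter.

```python
import threading

BYTES_PER_SAMPLE = 2   # paInt16 is 16-bit, so 2 bytes per sample
CHANNELS = 1
CHUNK_FRAMES = 1024    # same value as CHUNK_SIZE in the question


class FrameBuffer:
    """Hypothetical helper: thread-safe FIFO of raw PCM bytes, read in whole frames."""

    def __init__(self, bytes_per_frame):
        self._bytes_per_frame = bytes_per_frame
        self._data = bytearray()
        self._lock = threading.Lock()

    def write(self, pcm_bytes):
        # Append instead of overwriting, so queued audio is not dropped.
        with self._lock:
            self._data.extend(pcm_bytes)

    def read(self, frame_count):
        # Return exactly frame_count frames, or None if too little is queued.
        needed = frame_count * self._bytes_per_frame
        with self._lock:
            if len(self._data) < needed:
                return None
            chunk = bytes(self._data[:needed])
            del self._data[:needed]
            return chunk


frame_buffer = FrameBuffer(BYTES_PER_SAMPLE * CHANNELS)
frame_buffer.write(b"\x00\x01" * 1500)      # 1500 mono int16 frames
first = frame_buffer.read(CHUNK_FRAMES)     # 1024 frames, i.e. 2048 bytes
second = frame_buffer.read(CHUNK_FRAMES)    # only 476 frames left, so None
```

In the question's code, the callback would call `write()` with the processed bytes and the playback loop would call `read(CHUNK_SIZE)`, skipping the write to the output stream when it returns `None`.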

In what way does it not work as expected? What does it do that is problematic? — Jon Nordby, Apr 16 '23 at 14:48
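One possible contributor, offered as a guess until the symptoms are clarified, is the data format. The callback passes a 1-D `int16` array to `separate()`, whereas Spleeter loads files as floating-point waveforms shaped `(n_samples, n_channels)`. Assuming `separate()` expects and returns that same layout, casting its float output straight to `int16` would truncate values in the range -1 to 1 to near silence. The sketch below shows only the conversion in both directions; the function names are hypothetical, and the scaling convention (dividing by 32768) is an assumption.

```python
import numpy as np


def int16_bytes_to_waveform(pcm_bytes, channels=1):
    """Hypothetical helper: raw int16 PCM bytes to float32 (n_samples, channels)."""
    samples = np.frombuffer(pcm_bytes, dtype=np.int16)
    # Scale to roughly [-1.0, 1.0) and add an explicit channel axis.
    return (samples.astype(np.float32) / 32768.0).reshape(-1, channels)


def waveform_to_int16_bytes(waveform):
    """Hypothetical helper: float waveform back to raw int16 PCM bytes."""
    clipped = np.clip(waveform, -1.0, 1.0)  # guard against overflow on the cast
    return (clipped * 32767.0).astype(np.int16).tobytes()


pcm = np.array([0, 16384, -32768, 32767], dtype=np.int16).tobytes()
waveform = int16_bytes_to_waveform(pcm)
round_trip = np.frombuffer(waveform_to_int16_bytes(waveform), dtype=np.int16)
```

Two related points may also matter: the pretrained Spleeter models were trained on 44.1 kHz audio, while the stream here runs at 16 kHz, and constructing `Separator('spleeter:2stems')` inside the callback repeats the setup work for every 1024-frame chunk, so creating it once outside the callback is likely to be cheaper.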
