Is there a way to create a simple user-interactive music synth in python

Question

I am trying to write a Python synthesizer that uses either pygame.mixer or sounddevice to output the samples of a sine wave I've created in a NumPy array. The duration of the wave has to be fixed before the sine wave is created, for example sin(frequency * 2 * Pi * duration), so how do I play the sound for exactly as long as the user holds a key down?

I haven't found many articles on this for Python that are easy to understand, so any help would be appreciated.

Also, if someone could explain or give an example of how sounddevice.Stream or sounddevice.RawStream works with Python buffer objects, and whether it would help in my situation, that would be much appreciated.

I have already tried sounddevice.play(), but it seems too basic for what I'm trying to achieve. I have also tried creating a small segment of the sine wave and looping it while the key is held, but that would not work once I start modulating the sound.

Another reason I don't like sounddevice.play() is that I have to block the program with sounddevice.wait(); without it, the program runs to the end without playing anything.

In this video ... https://www.youtube.com/watch?v=tgamhuQnOkM ... which programs a synth in C++, the author uses a separate module that I think runs a background thread, but his module takes each sample separately rather than as an array.

I've also tried using pygame.sndarray.make_sound(). This is an example of the things I'd like to do when/if the synth works:

            import numpy as np # download
            import sounddevice as sd # download
            import time

            stream = []

            # Main Controls
            sps = 44100 # DON'T CHANGE

            carrier_hz = 440.0

            duration_s = 1.0

            atten = 0.3

            def amplitudeMod(t_samples, carrier):
                # Modulate the amplitude of the carrier
                ac = 0.2 # amplitude 0 = min, 1 = max
                ka = 1.0 # range of change 0.1 = less, 1.0 = most
                modulator_hz = 0.0 # frequency of modulation 20hz max
                modulator = np.sin(2 * np.pi * modulator_hz * t_samples / sps)
                envelope = ac * (1.0 + ka * modulator)
                return carrier * envelope

            def frequencyMod(t_samples, sps):
                # Modulate the frequency of the carrier
                k = 50.0 # range of change 0.1 = less, ?? = most
                modulator_hz = 10.0 # frequency of modulation
                carrier2 = 2 * np.pi * t_samples * carrier_hz / sps
                modulator = k * np.sin(2 * np.pi * t_samples * modulator_hz / sps)
                return np.cos(carrier2 + modulator)

            # Create carrier wave
            t_samples = np.arange(duration_s * sps)
            carrier = np.sin(2 * np.pi * carrier_hz * t_samples / sps)

            choice = input("1) Play sine\n2) Play amplitude modulation\n3) Play frequency modulation\n;")
            if choice == "2":
                output = amplitudeMod(t_samples, carrier)
            elif choice == "3":
                output = frequencyMod(t_samples, sps)
            else:
                # "1" or any unrecognised input: fall back to the plain sine
                output = carrier

            # Output

            output *= atten

            sd.play(output, sps)
            sd.wait()
            sd.stop()

Is there any way of driving this from a pygame key event, so that the sine wave plays only while the key is pressed and stops when the key is released?

DonH · Answer 1 · 2023-05-02T02:37:22.280

After trying a few libraries (including pygame, which does not seem to make it easy to generate and change the audio stream on the fly), I was able to make a controllable tone generator using the Python sounddevice library. I'm running macOS Monterey (Intel) and Python 3.11.

The example below includes a minimal tkinter GUI which sends key press and release events to the ToneGenerator.
The ToneGenerator class starts a sounddevice.OutputStream() which runs for the duration of program execution.
Key presses and releases call ToneGenerator.note_event() to pass the event to the ToneGenerator. A press sets the frequency and starts the ramp-up of a simple attack-release envelope; a release starts the ramp-down. The envelope ramp times are set in the _get_scaler_envelope function.
This demonstrates that you can modify the stream (frequency and amplitude) while sounddevice is streaming the audio.
To keep the example short (and because I'm still working on further capability), I've made a number of simplifying assumptions, which are documented in the docstring and comments.
Finally, I've tried running several ToneGenerator objects concurrently to play multiple notes at the same time (although not using this keyboard interface). This implementation appears to provide at least some degree of polyphony, but routing keys to the individual ToneGenerators is left for further development. I've not tested polyphony thoroughly or done any benchmarking.
Long, but minimal, code example follows (I hope - I'm new to this site and it may take a few times to post the code correctly)!
Make sure you call ToneGenerator with the correct device number for your computer (the program prints a list of devices to your console).
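
One way to find a usable device number without reading the printed list by eye is sketched below. It is an illustration only, not part of the answer's listing: first_output_device and pick_device are hypothetical helper names, and the sketch assumes that each entry returned by sd.query_devices() is a dict with a 'max_output_channels' key and that a device's number is its position in that list.

```python
from typing import Any, Mapping, Optional, Sequence


def first_output_device(devices: Sequence[Mapping[str, Any]]) -> Optional[int]:
    """Return the index of the first device that can play audio, or None.

    `devices` is expected to look like the result of sounddevice.query_devices():
    a sequence of dicts that each carry a 'max_output_channels' entry.
    """
    for index, info in enumerate(devices):
        if info.get("max_output_channels", 0) > 0:
            return index
    return None


def pick_device() -> Optional[int]:
    """Hypothetical wrapper; needs sounddevice and PortAudio to be installed."""
    import sounddevice as sd  # imported lazily so the helper above works without it

    return first_output_device(list(sd.query_devices()))
```

The result of pick_device() could then be passed as the device argument of ToneGenerator. The first output-capable device is not necessarily the one you want to hear, so treat this as a starting point; sounddevice also falls back to the system default output when device is None.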

""" Minimal example music synthesis using sounddevice library
    Adapted from:  https://python-sounddevice.readthedocs.io/en/0.4.6/examples.html#play-a-sine-signal
    plays notes when key pressed and stops at key release with fixed velocity (set in self.amplitude)
    sine and saw are provided
    simple attack-release envelope
    monophonic - only plays one key at a time
    it appears that multiple Tone classes can be run in parallel for polyphony if a key router is added
    Simplifications to keep this example short:
        monophonic and does not allow changing note while note is playing (it would cause popping)
        envelope attack and release span at least one whole callback block (about 11.6 ms) rather than starting mid-block
        no modulators (other than envelope) and no filters
        GUI is only used for non-blocking keyboard input - all other parameters are coded
        GUI interprets auto-repeating keys (when held) as separate press and release events (no debouncing)
    Intended to answer:
    https://stackoverflow.com/questions/54641717/is-there-a-way-to-create-a-simple-user-interactive-music-synth-in-python
"""

import sounddevice as sd  # https://python-sounddevice.readthedocs.io/en/0.4.6/
import numpy as np
import time
import tkinter as tk


class GUI(tk.Tk):
    def __init__(self, note_command):
        super().__init__()
        self.note_command = note_command  # called upon key press and release
        self.keys = {'c': 60, 'd': 62, 'e': 64, 'f': 65, 'g': 67, 'a': 69, 'b': 71, 'o': 72}  # midi notes (o is high c)
        tk.Label(self, text='press and release a key to play a note:\nc, d, e, f, g, a, b, or o for high c').\
            pack(padx=10, pady=10)
        self.message_label = tk.Label(self, font=('Helvetica', 36), width=20)
        self.message_label.pack(padx=20, pady=20)

        for key in self.keys:
            self.bind(f'<{key}>', self.key_press_event)
            self.bind(f'<KeyRelease-{key}>', self.key_release_event)

    def key_press_event(self, event):
        self.message_label.config(text=f'Key {event.char} (MIDI {self.keys[event.char]}) pressed')
        self.note_command(self.keys[event.char], True)

    def key_release_event(self, event):
        self.message_label.config(text=f'Key {event.char} (MIDI {self.keys[event.char]}) released')
        self.note_command(self.keys[event.char], False)


class ToneGenerator:
    @staticmethod
    def list_devices(): print(sd.query_devices())  # call to get list of available audio devices

    @staticmethod
    def note_to_f(midi_note: int): return 440.0 * 2 ** ((midi_note - 69) / 12)

    def __init__(self, device: int, waveform: str = 'sine'):
        self.device = device  # sd device
        self.waveform = waveform
        self.frequency = 440.0  # frequency in Hz of note (can't be zero, so set to any value before first note)
        self.amplitude = 0.0  # 0.0 <= amplitude <= 1.0  amplitude of note
        self.stream = None  # sd.OutputStream object
        self.stream_start_time = None
        self.prev_callback_time = None
        self.note_on_time = None  # for envelope
        self.note_off_time = None  # for envelope
        self.start_idx = 0  # index for callbacks
        self.samplerate = 44100

        def callback(outdata, frames, time_info, status):  # callback from sd.OutputStream (time_info avoids shadowing the time module)
            if self.prev_callback_time is None:
                self.prev_callback_time = self.stream.time
            elapsed = self.stream.time - self.prev_callback_time
            self.prev_callback_time += elapsed
            np_env: np.ndarray = self.get_envelope(frames, elapsed)
            t = (self.start_idx + np.arange(frames)) / self.samplerate
            t = t.reshape(-1, 1)

            if self.waveform == 'sine':
                outdata[:] = np_env * np.sin(2 * np.pi * self.frequency * t)
            elif self.waveform == 'saw':
                outdata[:] = np_env * 2 * (t % (1/self.frequency) * self.frequency - .5)
            else:
                raise ValueError(f'ToneGenerator: invalid waveform {self.waveform}')
            self.start_idx += frames

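        # pass samplerate explicitly so the stream matches the rate used to compute t in the callback
        self.stream = sd.OutputStream(device=device, samplerate=self.samplerate, channels=1, callback=callback)
        self.stream.start()  # returns immediately, stream continues until killed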

    def note_event(self, note: int, press: bool = True):
        """ note is midi note number, press indicates whether key pressed or released
        """
        if press:
            self.note_on_time = time.time()
            self.note_off_time = None
            self.frequency = ToneGenerator.note_to_f(note)
            self.amplitude = 1.0  # computer keys aren't velocity sensitive!
        else:
            self.note_on_time = None
            self.note_off_time = time.time()

    def get_envelope(self, frames: int, sample_time: float) -> np.ndarray:
        current = self._get_scaler_envelope()
        previous = self._get_scaler_envelope(time_delta = -sample_time)
        return np.linspace(previous, current, num=frames).reshape((frames, 1))

    def _get_scaler_envelope(self, time_delta: float = 0.0) -> float:  # helper for get_envelope
        attack_time = .01  # seconds
        release_time = .4  # seconds
        env = 0.0
        if self.note_on_time is not None:
            env = max(0, min((1.0, (time.time() + time_delta - self.note_on_time) / attack_time)))
        elif self.note_off_time is not None:
            env = min(1, max((0, 1 - (time.time() + time_delta - self.note_off_time) / release_time)))
        return env * self.amplitude


if __name__ == '__main__':
    ToneGenerator.list_devices()
    # set device based on output from ToneGenerator.list_devices(); don't name this `sd`, that would shadow the module
    tone_generator = ToneGenerator(device=5, waveform='saw')
    app = GUI(tone_generator.note_event)
    app.mainloop()
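
The listing above computes each block from an absolute sample index, so changing the frequency while a note is sounding makes the phase jump, which is the popping the docstring warns about. A common alternative is to accumulate the phase from block to block. The sketch below illustrates that technique; it is not the answer's method, and PhaseOscillator is a hypothetical name. It covers only the sine waveform and needs only NumPy, so it can be tried without any audio hardware.

```python
import numpy as np


class PhaseOscillator:
    """Sine oscillator that remembers its phase between blocks (hypothetical helper)."""

    def __init__(self, samplerate: int = 44100):
        self.samplerate = samplerate
        self.phase = 0.0  # radians, carried over from one block to the next

    def render(self, frames: int, frequency: float) -> np.ndarray:
        """Return `frames` samples at `frequency` Hz, shaped (frames, 1)."""
        step = 2 * np.pi * frequency / self.samplerate  # phase advance per sample
        phases = self.phase + step * np.arange(frames)
        # Remember where the next block has to start; wrap to keep the value small.
        self.phase = (self.phase + step * frames) % (2 * np.pi)
        return np.sin(phases).reshape(-1, 1)
```

Inside the stream callback, the sine branch could then become something like outdata[:] = np_env * osc.render(frames, self.frequency), with osc created once in __init__. Because each block starts at the phase where the previous one ended, the waveform stays continuous when the frequency changes between blocks.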
