I am currently working on augmenting audio in Python. I've been using Librosa due to its speed and simplicity but need to fallback on PyDub for some other utilities such as applying gain.
Is there a mathematical way to add gain to the Numpy array provided with librosa.load? In PyDub it is quite easy but I have to constantly convert back between Pydub's get_array_of_samples() to np.array then to the proper 32 bit float representation on the [-1,1) scale (that Librosa uses by default). I'd rather keep it all in one library for simplicity.
Also a normalization of an audio signal to 0 db gain beforehand would be useful too. I am a bit new to a lot of the terminology used in audio signal processing.
This is what I am currently doing. Down the road I would like to make this a class method which starts with using librosa's numpy array, so if there is a way to mathematically add specified gain in a certain unit to a numpy array from librosa that would be ideal.
Thanks
import librosa
import numpy as np
from pydub import AudioSegment, effects
pydub_audio = AudioSegment.from_file(audio_file_path)
pydub_audio = pydub_audio.set_frame_rate(16000) # make file 16k khz frame rate
print("Original dBFS is {}".format(pydub_audio.dBFS))
pydub_audio = pydub_audio.apply_gain(20) # apply 20db of gain to introduce clipping
#pydub_audio = effects.normalize(pydub_audio)
print("New dBFS is {}".format(pydub_audio.dBFS))
pydub_array = pydub_audio.get_array_of_samples()
pydub_array = np.array(pydub_array)
print("PyDub audio type is {}".format(pydub_array.dtype))
pydub_array_32bitfloat = pydub_array.astype(np.float32, order = 'C') / 32768 # rescaling to between [-1, 1] like librosa
print("Rescaled Pydub type is {}".format(pydub_array_32bitfloat.dtype))
import soundfile as sf
sf.write(r"test_pydub_gain.wav", pydub_array_32bitfloat, samplerate = 16000, format = 'wav')