
I am currently working on augmenting audio in Python. I've been using Librosa for its speed and simplicity, but I have to fall back on PyDub for some other utilities such as applying gain.

Is there a mathematical way to add gain to the NumPy array returned by librosa.load? In PyDub it is quite easy, but I have to constantly convert from PyDub's get_array_of_samples() to np.array and then to the 32-bit float representation on the [-1, 1) scale that Librosa uses by default. I'd rather keep it all in one library for simplicity.

Normalizing the audio signal to 0 dB beforehand would also be useful. I am a bit new to a lot of the terminology used in audio signal processing.

This is what I am currently doing. Down the road I would like to turn this into a class method that starts from Librosa's NumPy array, so a way to mathematically apply a specified gain, in a given unit, to that array would be ideal.

Thanks

import librosa
import numpy as np
import soundfile as sf

from pydub import AudioSegment, effects

audio_file_path = "input.wav"  # placeholder path, replace with your own file

pydub_audio = AudioSegment.from_file(audio_file_path)
pydub_audio = pydub_audio.set_frame_rate(16000)  # resample to a 16 kHz frame rate

print("Original dBFS is {}".format(pydub_audio.dBFS))
pydub_audio = pydub_audio.apply_gain(20)  # apply 20 dB of gain to introduce clipping
# pydub_audio = effects.normalize(pydub_audio)
print("New dBFS is {}".format(pydub_audio.dBFS))

# Convert PyDub's array.array of integer samples to a NumPy array
pydub_array = pydub_audio.get_array_of_samples()
pydub_array = np.array(pydub_array)
print("PyDub audio type is {}".format(pydub_array.dtype))

# Rescale to [-1, 1) like librosa; dividing by 32768 assumes 16-bit samples
pydub_array_32bitfloat = pydub_array.astype(np.float32, order='C') / 32768
print("Rescaled PyDub type is {}".format(pydub_array_32bitfloat.dtype))

sf.write("test_pydub_gain.wav", pydub_array_32bitfloat, samplerate=16000, format='wav')
Coldchain9
  • Applying gain is just amplifying the signal, which is just multiplying by a scaling factor. The scaling factor can be computed from the decibel value. – Jon Nordby Jun 25 '20 at 13:03
  • Normalization to 0 dB means scaling the data such that the max values are at 1.0/-1.0 – Jon Nordby Jun 25 '20 at 13:04
  • @jonnor thank you. Gain is quite simple now that I've learned more. Also thanks for the point on normalizing. I suppose it is more of a "scaling" than a normalization though, correct? A normalization would be more involved, like using peak amplitude, RMS, etc.? – Coldchain9 Jun 25 '20 at 19:19
  • What I proposed above is using the peak amplitude – Jon Nordby Jun 25 '20 at 20:05
  • @jonnor how would I go about doing so mathematically? Without the use of PyDub, that is. I would like to understand what is going on mathematically. – Coldchain9 Jun 25 '20 at 20:16
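
The peak-amplitude normalization described in the comments above can be sketched in plain NumPy. This is only an illustration, assuming a float array on the [-1, 1) scale such as the one librosa.load returns; the function name peak_normalize and the target_dbfs parameter are hypothetical, not part of Librosa or PyDub.

```python
import numpy as np

def peak_normalize(samples, target_dbfs=0.0):
    """Scale samples so the largest absolute value sits at target_dbfs.

    Assumes float samples where full scale is 1.0 (librosa's default).
    """
    peak = np.max(np.abs(samples))
    if peak == 0:
        # Pure silence: there is nothing to scale
        return samples.copy()
    # Convert the target level in dB to a linear amplitude (0 dB -> 1.0)
    target_peak = 10.0 ** (target_dbfs / 20.0)
    scale = float(target_peak / peak)
    # Cast back so a float32 input stays float32
    return (samples * scale).astype(samples.dtype, copy=False)
```

With the default target of 0 dB the loudest sample ends up at ±1.0; a target such as -1.0 leaves a little headroom. Normalizing to an RMS level instead would use the same scaling idea with np.sqrt(np.mean(samples**2)) in place of the peak.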

1 Answer


Thinking about it (if I am not wrong), the gain in decibels is mathematically dB = 20 * log10(level2 / level1), so I would multiply all elements of the array by 10**(dB/20) to apply the gain.
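
A minimal sketch of that multiplication in NumPy, assuming the float array on the [-1, 1) scale that librosa.load returns. The function name apply_gain_db and the clip option are hypothetical helpers for illustration, not part of Librosa or PyDub.

```python
import numpy as np

def apply_gain_db(samples, gain_db, clip=False):
    """Apply a gain given in decibels to an array of float samples."""
    # Convert dB to a linear amplitude factor: +20 dB -> 10x, -6 dB -> ~0.5x
    factor = 10.0 ** (gain_db / 20.0)
    out = samples * factor
    if clip:
        # Optionally hard-limit to the [-1, 1] range, as an integer format would
        out = np.clip(out, -1.0, 1.0)
    # Cast back so a float32 input stays float32
    return out.astype(samples.dtype, copy=False)
```

Usage would look like y, sr = librosa.load(path, sr=16000) followed by louder = apply_gain_db(y, 20). Without clipping the float array can exceed ±1.0, and whether that is clipped when the file is written depends on the output format.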

  • 1
    Could you ask an actual question? Is this just an assessment? Which language are you using? What is your code, if there is any? – Frederic Perron Jul 02 '21 at 18:59
  • I was trying to answer the question above: "Is there a mathematical way to add gain to the Numpy array provided with librosa.load?" The language is Python. If I had to apply a gain (in dB) to a NumPy array, I would multiply all elements of the array by 10**(dB/20) – SebAcou Jul 03 '21 at 12:07