I am new to both python and librosa. I am trying to follow this method for a speech recognizer: acoustic front end
My code:
import librosa
import librosa.display
import numpy as np
y, sr = librosa.load('test.wav', sr = None)
normalizedy = librosa.util.normalize(y)
stft = librosa.core.stft(normalizedy, n_fft = 256, hop_length=16)
mel = librosa.feature.melspectrogram(S=stft, n_mels=32)
melnormalized = librosa.util.normalize(mel)
mellog = np.log(melnormalized) - np.log(10**-5)
The problem is that when I apply librosa.util.normalize to variable mel, I expect values to be between 1 and -1, which they aren't. What am I missing?