I have an audio file of 10 seconds in length. If I generate the spectrogram
using matplotlib
, then I get a different number of timesteps as compared to the spectrogram generated by librosa
.
Here is the code:
fs = 8000
nfft = 200
noverlap = 120
hop_length = 120
audio = librosa.core.load(path, sr=fs)
# Spectogram generated using matplotlib
spec, freqs, bins, _ = plt.specgram(audio, nfft, fs, noverlap = noverlap)
print(spec.shape) # (101, 5511)
# Using librosa
spectrogram_librosa = np.abs(librosa.stft(audio,
n_fft=n_fft,
hop_length=hop_length,
win_length=nfft,
window='hann')) ** 2
spectrogram_librosa_db = librosa.power_to_db(spectrogram_librosa, ref=np.max)
print(spectrogram_librosa_db.shape) # (101, 3676)
Can someone explain it to me why is there a huge diff in the time steps and how to make sure that both generate the same output?