
Consider the following code:

import numpy as np
import matplotlib.pyplot as plt
from librosa import cqt

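# A 1000 Hz sine sampled over one second (44100 samples)
s = np.linspace(0, 1, 44100)
x = np.sin(2 * np.pi * 1000 * s)
fmin = 500  # frequency of the lowest CQT bin, in Hz

cq_lib = cqt(x, sr=44100, fmin=fmin, n_bins=40)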

plt.imshow(abs(cq_lib),aspect='auto', origin='lower')
plt.xlabel('Time Steps')
plt.ylabel('Freq bins')

It produces a spectrogram like this:

[Figure: CQT spectrogram of the 1000 Hz sine; x-axis: Time Steps, y-axis: Freq bins]

When you look closely at the beginning and the end of the spectrogram, you can see some defects.

Plotting only the first and the last time steps shows that the frequency content there is not correct.

First Frame

plt.plot(abs(cq_lib)[:,0])
plt.ylabel('Amplitude')
plt.xlabel('Freq bins')
plt.tick_params(labelsize=16)

[Figure: amplitude per frequency bin for the first frame]

Last and 2nd Last frame comparison

plt.plot(abs(cq_lib)[:,-1])
plt.plot(abs(cq_lib)[:,-2])
plt.legend(['last step', '2nd last step'], fontsize=16)
plt.ylabel('Amplitude')
plt.xlabel('Freq bins')
plt.tick_params(labelsize=16)

[Figure: amplitude per frequency bin for the last and 2nd last frames]

My attempt to solve it

As far as I know, this is caused by padding and by centering the STFT window on each frame. However, cqt does not seem to support the argument center=False.

cq_lib = cqt(x,sr=44100, fmin=fmin, n_bins=40,center=False)

TypeError: cqt() got an unexpected keyword argument 'center'

Am I doing anything wrong? How can I get the equivalent of center=False in cqt?

Raven Cheuk

1 Answer


I think you might want to try pad_mode, which cqt supports. The np.pad documentation lists the available options (they are also listed at the end of this post). With the wrap option you get a result like the one below, though I suspect the phase is a mess, so make sure this meets your needs. If you always generate your own signal, you could try passing a <function> instead of one of the named options.

import numpy as np
import matplotlib.pyplot as plt
from librosa import cqt

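# A 1000 Hz sine sampled over one second (44100 samples)
s = np.linspace(0, 1, 44100)
x = np.sin(2 * np.pi * 1000 * s)
fmin = 500  # frequency of the lowest CQT bin, in Hz

# pad_mode is forwarded to np.pad; 'wrap' pads with samples from the opposite end of the signal
cq_lib = cqt(x, sr=44100, fmin=fmin, n_bins=40, pad_mode='wrap')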

plt.imshow(abs(cq_lib),aspect='auto', origin='lower')
plt.xlabel('Time Steps')
plt.ylabel('Freq bins')

[Figure: CQT spectrogram of the 1000 Hz sine with pad_mode='wrap']

If you look at the first frame and the last two frames, you can see that they now look much better. I tried this with librosa 0.6.3 and 0.7.0, and the results were the same.

[Figure: amplitude per frequency bin for the first frame with pad_mode='wrap']

[Figure: amplitude per frequency bin for the last and 2nd last frames with pad_mode='wrap']
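One way to see why wrap behaves well for this particular signal is to compare each padded signal against the same sine simply continued past its ends. The sketch below uses only np.pad, not librosa. It makes two assumptions that are not in the question: the time axis excludes the endpoint, so the signal holds exactly 1000 periods, and the padding length of 512 samples is a hypothetical value, not the one librosa uses internally.

```python
import numpy as np

sr = 44100
freq = 1000.0
pad = 512  # hypothetical padding length, not librosa's internal value

# Endpoint-exclusive time axis (assumption): exactly 1000 full periods
n = np.arange(sr)
x = np.sin(2 * np.pi * freq * n / sr)

# Reference: the same sine evaluated on the extended time axis
n_ext = np.arange(-pad, sr + pad)
ideal = np.sin(2 * np.pi * freq * n_ext / sr)

# Largest deviation of each padded signal from the continued sine
errors = {}
for mode in ("constant", "reflect", "wrap"):
    padded = np.pad(x, pad, mode=mode)
    errors[mode] = float(np.max(np.abs(padded - ideal)))

for mode, err in errors.items():
    print(f"{mode:>8}: max deviation from a continuous sine = {err:.3g}")
```

For a signal that does not contain a whole number of periods, wrap would introduce a jump at the seam as well, so this is specific to test tones like the one in the question.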

Try some of the options; hopefully one of the padding modes will do the trick. The np.pad options are: 'constant', 'edge', 'linear_ramp', 'maximum', 'mean', 'median', 'minimum', 'reflect', 'symmetric', 'wrap', 'empty', <function>
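For the <function> option, np.pad expects a callable with the signature (vector, iaxis_pad_width, iaxis, kwargs) that fills the padded regions of vector in place. Here is a minimal sketch for a signal you generate yourself. The name sine_extension and the keyword arguments freq and sr are hypothetical, chosen for illustration. I have only checked this against np.pad directly; whether cqt forwards a callable pad_mode unchanged is an assumption you would need to verify for your librosa version, and if librosa resamples internally, the sr used in the callback would have to match.

```python
import numpy as np

def sine_extension(vector, iaxis_pad_width, iaxis, kwargs):
    """Hypothetical np.pad callback: continue a known sine into the padding."""
    freq = kwargs["freq"]
    sr = kwargs["sr"]
    before, after = iaxis_pad_width
    # Sample indices relative to the first sample of the original signal
    n = np.arange(vector.shape[0]) - before
    ideal = np.sin(2 * np.pi * freq * n / sr)
    # Overwrite only the padded regions, in place
    if before > 0:
        vector[:before] = ideal[:before]
    if after > 0:
        vector[-after:] = ideal[-after:]

sr = 44100
freq = 1000.0
pad = 512  # hypothetical padding length

x = np.sin(2 * np.pi * freq * np.arange(sr) / sr)
padded = np.pad(x, pad, mode=sine_extension, freq=freq, sr=sr)

# The padded signal should match the sine evaluated on the extended axis
ideal = np.sin(2 * np.pi * freq * np.arange(-pad, sr + pad) / sr)
print(np.allclose(padded, ideal))
```

Unlike wrap, this keeps the phase continuous even when the signal does not contain a whole number of periods, but it only works when the signal's formula is known in advance.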

  • I see. But I am wondering what causes the two peaks in the first time step. Even with padding, the majority of the signal would still be at 1000 Hz. So why does the first frame have less energy at 1000 Hz, with two peaks appearing at the adjacent bins instead? I would also like to know the physics behind it. – Raven Cheuk Aug 17 '19 at 04:03
  • If I'm interpreting it correctly, the two frequency components shown in your plot of the first frame, combined with the appropriate phase relationship, would approximate the onset of the signal. –  Aug 17 '19 at 04:18
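One possible explanation for the two peaks, assuming the first frame is centered on the first sample and its left half is filled by reflect padding (as far as I know, the default pad_mode in the librosa versions mentioned in the answer): mirroring a sine that starts at phase zero gives sin(ω|t|), which is sin(ωt) multiplied by sign(t), i.e. a sine with a phase jump of π at the frame center. The two halves then cancel at exactly 1000 Hz, and the energy moves to the neighbouring frequencies. The sketch below checks this with plain NumPy. The frame length of 4096 samples and the helper response are hypothetical and do not reproduce librosa's actual filters.

```python
import numpy as np

sr = 44100
f0 = 1000.0
half = 2048  # hypothetical half frame length in samples

# Time axis of one frame centered on the first sample of the signal
t = np.arange(-half, half) / sr
window = np.hanning(2 * half)

# Reflect padding mirrors the signal about its first sample: x(-t) = x(t)
boundary_frame = np.sin(2 * np.pi * f0 * np.abs(t))
# A frame taken from the middle of the signal has no phase jump
interior_frame = np.sin(2 * np.pi * f0 * t)

def response(frame, freq):
    """Magnitude of the windowed frame correlated with a complex tone at freq."""
    probe = np.exp(-2j * np.pi * freq * t)
    return float(np.abs(np.sum(window * frame * probe)) / np.sum(window))

df = sr / (2 * half)  # frequency spacing of one DFT bin for this frame length

for name, frame in (("interior", interior_frame), ("boundary", boundary_frame)):
    mags = [response(frame, f0 + k * df) for k in (-1, 0, 1)]
    print(name, [f"{m:.3f}" for m in mags])
```

The interior frame peaks at 1000 Hz, whereas the boundary frame shows a near null there with larger values on either side, which matches the shape of the first-frame plot in the question.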