I have a .wav file. I load it and get the following spectrogram, which shows the spectrum in dB:

https://i.stack.imgur.com/22TjY.png

Now I would like to extract these values exactly, because I want to compare them with another .wav file and recognize whether the same 4 values are present there.

https://i.stack.imgur.com/Jun25.png

The source used to generate these pictures (taken from another Stack Overflow example):

## some stuff here

for i in range(0, int(RATE / CHUNK_SIZE * RECORD_SECONDS)):
    # little endian, signed short
    data_chunk = array('h', stream.read(CHUNK_SIZE))
    if byteorder == 'big':
        data_chunk.byteswap()
    data_all.extend(data_chunk)

## some stuff here

Fs = 16000
f = np.arange(1, 9) * 2000
t = np.arange(RECORD_SECONDS * Fs) / Fs 
x = np.empty(t.shape)
for i in range(8):
    x[i*Fs:(i+1)*Fs] = np.cos(2*np.pi * f[i] * t[i*Fs:(i+1)*Fs])

w = np.hamming(512)
Pxx, freqs, bins = mlab.specgram(data_all, NFFT=512, Fs=Fs, window=w, 
                noverlap=464)

# plot the spectrogram in dB (10*log10 of the power)
Pxx_dB = 10 * np.log10(Pxx)
pyplot.subplots_adjust(hspace=0.4)

pyplot.subplot(211)
ex1 = bins[0], bins[-1], freqs[0], freqs[-1]
pyplot.imshow(np.flipud(Pxx_dB), extent=ex1)
pyplot.axis('auto')
pyplot.axis(ex1)
pyplot.xlabel('time (s)')
pyplot.ylabel('freq (Hz)')

I "think" that the information is in Pxx, but I don't know how to extract it.

fler

1 Answer

From the documentation, I gather that Pxx is a simple 2D numpy array.

You're interested in periodograms around 1s. Considering `Pxx` should have 512 columns and your sample is about 5s long, I'd take a slice somewhere around column 100: `periodogram_of_interest = Pxx[:, 100]`
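Rather than guessing a column number, the column can be looked up by time, since `bins` holds the centre time of each column. A minimal sketch, assuming a synthetic stand-in for `data_all` (the tone frequencies, the noise level and the 1.5 s target time are made up for illustration):

```python
import numpy as np
from matplotlib import mlab

Fs = 16000

# Hypothetical stand-in for data_all: 5 s of faint noise,
# with four tones switched on from t = 1 s onwards.
rng = np.random.default_rng(0)
t = np.arange(5 * Fs) / Fs
x = 0.01 * rng.standard_normal(t.size)
on = t >= 1.0
for f0 in (1000, 2500, 4000, 6000):
    x[on] += np.cos(2 * np.pi * f0 * t[on])

Pxx, freqs, bins = mlab.specgram(x, NFFT=512, Fs=Fs,
                                 window=np.hamming(512), noverlap=464)

# Rows follow freqs, columns follow bins (the segment centre times).
print(Pxx.shape, freqs.shape, bins.shape)

# Pick the column whose centre time is closest to the moment of interest.
t_interest = 1.5  # seconds, assumed
col = int(np.argmin(np.abs(bins - t_interest)))
periodogram_of_interest = Pxx[:, col]
```

With real data, only the `mlab.specgram` call and the lines after it are needed; `t_interest` has to be read off the plot.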

Then find the 4 maxima. Unfortunately, each of those 4 frequencies has a finite width, so simply looking for the top 4 maxima will not be that easy. However, assuming your signal is quite clean, there's a function in `scipy.signal` that will list all local extrema: `argrelmax`. You could play with the `order` argument of that function to reduce your search space.

With the values returned from that function, you could get the frequencies like this: `freqs[those_4_indices]`.
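As a rough sketch of that step, here is `argrelmax` applied to a made-up spectrum with four bumps. The bump positions, heights and widths and the `order=5` setting are assumptions for illustration; on real data `periodogram_of_interest` and `freqs` would come from the `specgram` call instead.

```python
import numpy as np
from scipy.signal import argrelmax

# Hypothetical stand-in for one column of Pxx: 257 bins from 0 to 8000 Hz
# with four bumps of finite width on a flat floor.
freqs = np.linspace(0, 8000, 257)
periodogram_of_interest = np.full(freqs.size, 1e-6)
for centre, height in zip((1000, 2500, 4000, 6000), (1.0, 0.8, 0.6, 0.9)):
    periodogram_of_interest += height * np.exp(-0.5 * ((freqs - centre) / 60.0) ** 2)

# A bin counts as a local maximum only if it is strictly larger than
# its `order` neighbours on each side.
candidates = argrelmax(periodogram_of_interest, order=5)[0]

# Keep the 4 strongest candidates, then put them back in frequency order.
strongest = candidates[np.argsort(periodogram_of_interest[candidates])[-4:]]
those_4_indices = np.sort(strongest)

print(freqs[those_4_indices])
```

A larger `order` suppresses more of the small wiggles around each peak, at the risk of merging peaks that sit close together.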

Oliver W.
  • Why are we assuming that Pxx has 512 columns? I have a fixed sample of 5 seconds. Also, my signal is really dirty: I have noise from the environment – fler Apr 07 '14 at 17:08
  • Sorry, that was a bad assumption on my part. As I have no idea how long `data_all` is, I created a (random) array, which in my case turned out to give a 2D array with 512 columns (lucky shot). It's best that you obtain the shape of your `Pxx` with `Pxx.shape` of course. Then take a slice about 20% through, because that's where (in your image) the signals have started. Also, even if your signal is noisy, using `argrelmax` will work, although you might have to help it by cutting your specgram up in slices (similar to your 2nd image). – Oliver W. Apr 07 '14 at 19:18
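One possible reading of the "slices" suggestion for a noisy signal, sketched under loud assumptions: average the columns from about 20% of the recording onwards to smooth out the noise, split the frequency axis into bands, and take the strongest bin in each band. The synthetic signal, the noise level and the `band_edges` values below are all hypothetical; the bands would have to be chosen from the actual plot.

```python
import numpy as np
from matplotlib import mlab

Fs = 16000

# Hypothetical noisy stand-in for data_all: four tones from t = 1 s
# onwards, buried in fairly strong white noise.
rng = np.random.default_rng(1)
t = np.arange(5 * Fs) / Fs
x = 0.5 * rng.standard_normal(t.size)
tones = [1000, 2500, 4000, 6000]
on = t >= 1.0
for f0 in tones:
    x[on] += np.cos(2 * np.pi * f0 * t[on])

Pxx, freqs, bins = mlab.specgram(x, NFFT=512, Fs=Fs,
                                 window=np.hamming(512), noverlap=464)
print(Pxx.shape)  # check the real shape instead of assuming it

# Average all columns from about 20% through the recording onwards.
start = int(0.2 * Pxx.shape[1])
mean_spectrum = Pxx[:, start:].mean(axis=1)

# Assumed frequency bands (Hz), one expected tone per band.
band_edges = [0, 2000, 3000, 5000, 8000]
found = []
for lo, hi in zip(band_edges[:-1], band_edges[1:]):
    in_band = np.flatnonzero((freqs >= lo) & (freqs < hi))
    found.append(freqs[in_band[np.argmax(mean_spectrum[in_band])]])

print(found)
```

Comparing `found` between two recordings, with a tolerance of about one frequency bin (`Fs / NFFT`, here 31.25 Hz), would be one way to check whether the same 4 values are present.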