I am trying to do a spectrogram analysis of a song. Currently I have a clip of about 10 seconds from the song and am attempting to find its local peaks.

All I really want is a scatter plot showing the local maxima within some NxN neighborhood of amplitudes.

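[y,fs] = audioread('audio_file.wav');   % samples and sampling rate
window = hamming(512);                  % 512-sample Hamming window
num_overlap = 256;                      % 50% overlap between segments
nfft = 1024;                            % FFT length
[S,F,T,P] = spectrogram(y(:,1), window, num_overlap, nfft, fs, 'yaxis');  % first channel only
surf(T,F,10*log10(P), 'edgecolor', 'none'); axis tight; view(0, 90); colormap hot;  % PSD in dB, seen from above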

This results in the below image:

[Image: spectrogram surface plot of the clip]

Here the x-axis is time [0, ~10 s], the y-axis is frequency [0, 22.5 kHz], and the z-axis is the amplitude in dB.

What I would like to do now is create a 3D scatter plot over this surf to show where the peaks are. The dimensions of S, F, T, P are:
S: 513 x 1770 complex double
F: 513 x 1 double
T: 1 x 1770 double
P: 513 x 1770 double
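
As a sanity check on those sizes, here is a minimal sketch of where 513 and 1770 come from. For a real-valued signal and an even nfft, spectrogram returns nfft/2 + 1 frequency rows, and the number of time columns follows from the segment hop. The sample count Nx below is an assumption picked to match the sizes listed above; the real clip length is not given in the post.

```matlab
% Expected spectrogram output sizes for a real-valued signal.
Nx       = 453376;   % ASSUMED number of samples in y(:,1), not from the post
win_len  = 512;      % length(window)
noverlap = 256;      % num_overlap
nfft     = 1024;

n_freq = nfft/2 + 1;                                   % one-sided spectrum, even nfft
n_time = fix((Nx - noverlap) / (win_len - noverlap));  % number of time segments

fprintf('%d x %d\n', n_freq, n_time);
```

So F indexes the rows of P and T indexes its columns, which matters when pairing coordinates with peaks later on.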

Now this is where I am pretty sure I am doing something wrong or not understanding MATLAB entirely.

msk = true(3,3,3);
msk(2,2,2) = false;
dil = imdilate(10*log10(P), msk);
M = 10*log10(P) > dil;

My understanding is that this will give me a 1 wherever there is a local peak.

Now, say that amp = 10*log10(P). I would like to be able to call scatter3 the same way I called surf, like so:

scatter3(T, F, amp(M))

but of course I get the error "X, Y and Z must be vectors of the same length". That makes sense to me, so I decided to repeat the values as many times as needed to make the three inputs the same length.

Tr = repelem(T, 513)';
Fr = repelem(F, 1770);
Zr = reshape(amp, [908010, 1]);
[pks, locs] = findpeaks(Zr);
scatter3(Tr(locs), Fr(locs), Zr(locs));

This results in a 3D scatter plot like so:

[Image: resulting 3D scatter plot]

And that is definitely not right because there should be many local peaks throughout the amplitude shown. I'm not really sure what I'm doing wrong, but I'm also almost positive that there's an easier way to achieve what I want.

Chrispresso
  • Would it be possible to get access to that audio clip? I'd like to be able to reproduce your graph as well as help you actually plot what you want. Also, you aren't specifying the mask properly. You actually want the centre element to be `true` and not `false`. This technique is called **non-maxima suppression**, which ensures that a window's centre element is the largest value and if it isn't, suppress this point. This is exactly what you are after when you want to find local peaks in a `N x N x N` 3D neighbourhood of elements. – rayryeng Apr 26 '16 at 06:16
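
The non-maxima suppression idea from the comment can be sketched on a small matrix. This is only an illustration under assumptions: A is made-up data standing in for 10*log10(P), N is a hypothetical neighbourhood size, and imdilate requires the Image Processing Toolbox. Since P is a 2-D matrix, a 2-D N-by-N mask is enough.

```matlab
% Local maxima within an N-by-N neighbourhood via grayscale dilation.
% A is made-up test data standing in for amp = 10*log10(P).
A = [17 24  1  8 15;
     23  5  7 14 16;
      4  6 13 20 22;
     10 12 19 21  3;
     11 18 25  2  9];
N = 3;                         % neighbourhood size (odd), an assumed choice

% Variant 1: centre included, keep elements equal to the window maximum.
dil = imdilate(A, true(N));    % each element becomes the max of its N-by-N window
M   = (A == dil);

% Variant 2: centre excluded, keep elements strictly above all neighbours.
nb = true(N);
nb(ceil(N/2), ceil(N/2)) = false;
M_strict = A > imdilate(A, nb);

[r, c] = find(M);              % row and column subscripts of the peaks
disp([r c]);
```

The two variants agree here because every value in A is distinct. They differ on flat regions: the equality test marks every element of a plateau, while the strict comparison marks none of them.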

1 Answer

If I understand what you want, you have a logical matrix M marking the local peaks and you want to draw a scatter plot at the locations of those peaks. You can get the row and column of each peak using find, and the corresponding linear index using sub2ind:

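amp = 10*log10(P);             % power in dB, as in the question
[Fi,Ti] = find(amp > dil);     % row (frequency) and column (time) subscripts of the peaks
Pi = sub2ind(size(P),Fi,Ti);   % matching linear indices into amp

scatter3(T(Ti),F(Fi),amp(Pi));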
ThP
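
An alternative to the find/sub2ind route is to expand T and F to the size of the amplitude matrix with meshgrid and index all three with the same logical mask. The sketch below uses small made-up stand-ins for T, F and amp (in the question these come from spectrogram) and overlays the peaks on the surface with hold on. It assumes the Image Processing Toolbox for imdilate.

```matlab
% Stand-in data: in the question T is 1-by-1770, F is 513-by-1, amp is 513-by-1770.
T   = linspace(0, 1, 6);        % 1-by-6 time vector (made up)
F   = linspace(0, 100, 4)';     % 4-by-1 frequency vector (made up)
amp = [0 1 0 0 0 0;
       1 5 1 0 2 0;
       0 1 0 2 6 2;
       0 0 0 0 2 0];            % made-up amplitudes with two clear peaks

% Strict local maxima in a 3-by-3 neighbourhood (centre excluded).
nb = true(3);
nb(2,2) = false;
M = amp > imdilate(amp, nb);

% Coordinate grids with the same size as amp: columns follow T, rows follow F.
[Tg, Fg] = meshgrid(T, F);

% Surface plus peak markers in the same axes.
surf(T, F, amp, 'EdgeColor', 'none'); axis tight; view(0, 90); colormap hot;
hold on;
scatter3(Tg(M), Fg(M), amp(M), 36, 'c', 'filled');
hold off;
```

Because Tg, Fg and amp share one size, the logical mask picks matching elements from each, which avoids the manual repelem and reshape bookkeeping.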