2

I am using scipy.signal.spectrogram() for analysing a dataset containing values for a current. My input to the function is as follows:

f, t, Sxx = signal.spectrogram(y, fs)

(for plotting in subplot 3 (from the top) I use plt.pcolormesh(t, f, Sxx))

Where y is a list of 10002 values, containing the y values for the green graph in the first plot (from the top). fs = 1/T where T = x[1]-x[0] where x is the list of x values (time) belonging to the y-values (current).

My problem is that

t[-1]-t[0] != x[-1]-x[0]

Meaning: I want to compare plot 3 with the green graph in plot 1, and when these two do not range over the same time-span, the spectrogram becomes useless. You can see from the picture that total_length_x > total_length_t

enter image description here

Why is this so? And what can I do to make the spectrum range over the same time-span as my original data?

ali_m
  • 71,714
  • 23
  • 223
  • 298
roggjah
  • 21
  • 1
  • 5
  • 1
    Please read http://docs.scipy.org/doc/scipy/reference/generated/scipy.signal.spectrogram.html to get an idea what this function returns. To generate spectrogram, a set of points has to be used (window) and FFT of these values is calculated. By default, you are using 256-point non-overlapping window. This means, your 1002 data set will provide 40 sets. I assume you have got 40 points of t. So, for each Sxx column (and every t), this corresponds to 256 of the original input time values. – hesham_EE Oct 03 '16 at 15:33
  • I have read the document, and I still don't understand the output. Here are some numbers: len(t) = 44. t[-1]-t[0] = 1.92639999. x[-1]-x[0] = 2.0001997. So you see, I am "missing" 0.0738 microsec of data. – roggjah Oct 05 '16 at 09:06
  • I made a mistake above about the overlap. In the document, if not supplied, overlap is assumed to be `256/8 = 32`. This means, your points will be `1 + floor((10002 - 256) / (256 - 32)) = 44`. The `t` points are the mid-range points of the window, i.e. `t[0] = x[128]` and for `n > 0`, `t[n] = x[128 + (n - 1) * (256 - 32)]` . I hope this explains it. – hesham_EE Oct 05 '16 at 13:40

1 Answers1

0

I wrote some code to explain my comments above about the sizes of the data:

#!/usr/bin/env python

import numpy as np
import scipy.signal
from scipy.signal import spectrogram

WINDOW_LEN = 256
OVERLAP_LEN = WINDOW_LEN / 8
DATA_LEN = 10002
DURATION = 2.0001997
fs =  (DATA_LEN - 1) / DURATION
eps = 1/(fs * 1000.0)

y = np.random.rand(DATA_LEN)
x = np.arange(0, DURATION + 1/fs, 1/fs)

f, t, Sxx = spectrogram(y, fs=fs, nperseg=WINDOW_LEN)

T = np.zeros( int(1 +  np.floor((len(y) - WINDOW_LEN) / (WINDOW_LEN - OVERLAP_LEN))) )

T[0] = x[WINDOW_LEN / 2]
T[1:] = [x[WINDOW_LEN / 2 + (n + 1) * (WINDOW_LEN - OVERLAP_LEN)] for n in np.arange(0,len(T) - 1)]


if all(t - T < eps):
    print (t - T)
    print "All are fine"
    print x[-1] - x[0]
    print t[-1] - t[0]
    print T[-1] - T[0]
else:
    print t
    print T
    print "Wrong estimates"
hesham_EE
  • 1,125
  • 13
  • 24
  • Thank's alot for this, this explained the output very nicely! My question now changed to: how to plot these values correctly? Because when I am plotting `plt.pcolormesh(t,f,Sxx)`, it seems wrong. Now my plot starts at `time = t[0]`, instead of starting at `time = 0`. So how can I correctly scale this plot to be a 1:1 timescale with my original dataset? (ofc. I can correct the offset with `t = t + x[0]` if `t[0]=0`.) But this will not fix my problem as tho the total time is different in them. – roggjah Oct 06 '16 at 10:28
  • `t,f,Sxx = self.spect_Ihv()` #basically calls signal.spectrogram() and returns the values `new_t = []` #changing the bins `for i in range(0, 45): jada = 0.045454545454545456 #2microsec/44bins = binsize new_t.append(jada * i)` `new_t = new_t + self.Ihv.x[0]` #adjusting to original datastring `ax3.pcolormesh(new_t, f, Sxx)` If I do this it scales perfectly, but I am not sure if what I am doing is "legal"?. Sorry for the horrible formatting... – roggjah Oct 06 '16 at 10:35
  • I can't read your code properly. But what I can say is that, each column of `Sxx`, returning from spectrogram, represents the time samples that are **centred** around `t`. The problem is that the first sample (and may be the last), has to be dealt with differently. With overlap, each `t` sample represents `256 - 256 / 8` time samples, `x`. I hope this helps. – hesham_EE Oct 07 '16 at 13:16
  • [Here is the same raw code used in the original post (removed two subplots)](http://imgur.com/b6P1xq9). [And here is the version I want, modified as showed in the code](http://imgur.com/ipI7wBY). My [code](http://imgur.com/WDx9FZh) is OO. Please let me know if anything is unclear. ny_t should be replaced with new_t in the two last lines (translated variablenames from norwegian). self.Ihv.x[0] is the first x-value in the original dataset. My questions is now, is this "streched" version the way I should present the data, or is this something I am not allowed to do? – roggjah Oct 10 '16 at 07:58
  • I think the only problem I see is that you are treating the spectrum pins as if they are equally spaced though out the input data though there is exception at the start and at the end. To say the truth, it's very minute details that I hope is fine to ignore. Another advise, check how to change the color map to be more descriptives with proper gradient to show the changes in the values, it looks so flat. – hesham_EE Oct 10 '16 at 16:33
  • I made some adjustments, applying a logscale to the spectrum so I can better see the small changes. And the fft at the bottom of the [new figure](http://imgur.com/a/SfaNK) is done in the time domain shown with vertical black lines in the upper plot. My problem with the exceptions at the start and end is still unsolved. I try to look at the [source code](https://github.com/scipy/scipy/blob/master/scipy/signal/spectral.py), but I can not figure out how this is handled. Does the exception occur only at the end? Or are they equally "cut" at the start and end? – roggjah Oct 21 '16 at 10:17
  • The natural processes I study are on the nanosecond timescale. So this small adjustment is actually critical to my project. – roggjah Oct 21 '16 at 10:19
  • The code that I supplied shows you exactly where the time stamps come from. I'd say that you need to trim both data and spectrogram from both ends to get them matching. Notice that, the trimming from the beginning is due to the different step in the first `t`. So, both data and spectrogram need to be trimmed. The last time sample, I think, if there is no enough samples for a window, it'll not generate Sxx for it. So, there might be a need to trim only data at the end. – hesham_EE Oct 24 '16 at 03:07