Computing the Fast Fourier Transform of a dataset using Python

Question

I want to calculate the FFT of a given signal using Python. The x axis is time (seconds) and the y axis is voltage. The signal has some kind of periodicity and looks like this:

Following this post, I get this figure:

Is this the correct FFT? The CSV file is here. And the code:

import numpy as np
import matplotlib.pyplot as plt
from pandas import read_csv
from scipy.fft import fft

plt.rcParams['figure.dpi'] = 1000

# load the dataset #1
dataframe = read_csv('data/1.csv', usecols=[1])

plt.plot(dataframe)
plt.show()

################ FFT with scipy
#number of sample points
N = 100
#sampling period
T = 1
#create x-axis for time length of signal
x = np.linspace(0, N*T, N)
#create array that corresponds to values in signal
y = dataframe
y = y - np.mean(y)
#perform FFT on signal
yf = fft(y)
#create new x-axis: frequency from signal
xf = np.linspace(0.0, 1.0/(2.0*T), N//2)
#plot results
plt.plot(xf, abs(yf[0:N//2]), label = 'signal')
plt.grid()
plt.xlabel('Frequency')
plt.ylabel('Spectral Amplitude')
plt.legend(loc=1)
plt.savefig('fft.jpg')
plt.show()

`Is this the correct FFT?` What do you mean specifically? `scipy.fft.fft` returns the correct FFT of its input. – FlyingTeller, Jul 20 '23 at 11:12
@FlyingTeller I am not sure whether a sampling period of 1 is correct. Also, if I greatly increase the number of samples (for example, to 2000), the figure changes completely. I guess my problem is that I don't really understand what the FFT of a non-sinusoidal signal looks like. – bardulia, Jul 20 '23 at 11:21
Is T=1 really the sampling period of the data in that file? What is the real inter-sample time in seconds? – Reinderien, Jul 20 '23 at 11:51
The FFT looks reasonable: the signal increases in frequency, and at higher frequencies only the samples near the peak amplitude are visible, hence the increasing amplitude with frequency on the FFT graph. You can try applying a window function to the signal before processing it, especially when the samples do not span a whole number of periods of the sampled signal. – alagner, Jul 20 '23 at 12:01
@Reinderien The inter-sample time is 2 seconds; sorry that I did not mention that. Does that mean I should set T = 2? – bardulia, Jul 20 '23 at 12:28
@alagner Thanks for your analysis. What exactly do you mean by a window function? Also, why is N = 100 suitable for this case? If I set a higher value (like 1000), I get strange results. Is this related to the Nyquist frequency? – bardulia, Jul 20 '23 at 12:31
https://en.wikipedia.org/wiki/Window_function I guess the part you should be particularly interested in is spectral leakage. As for the number of samples: it's not the number that matters, it's how close the samples come to containing N full periods of the signal (with N being a positive integer). Compare the FFTs of a 1 Hz sine wave sampled with Fs = 100 Hz for 1 second, 2 seconds and 1.75 seconds and you'll see what I mean (numbers off the top of my head, but they should do the trick). – alagner, Jul 20 '23 at 12:45
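
To make the comparison suggested in the last comment concrete, here is a minimal sketch (not from the thread). It builds the 1 Hz sine sampled at 100 Hz for 1 s, 2 s and 1.75 s and reports how much spectral power falls outside the peak bin and its immediate neighbours. The helper names `power_spectrum` and `leaked_fraction`, the extra 10.5 s record and the choice of `np.hanning` as the window are assumptions made for illustration only.

```python
import numpy as np

FS = 100.0  # sampling rate in Hz, as in the comment above
F0 = 1.0    # sine frequency in Hz, as in the comment above


def power_spectrum(duration_s, use_hann=False):
    """One-sided power spectrum of a sampled 1 Hz sine of the given duration."""
    n = int(round(duration_s * FS))
    t = np.arange(n) / FS
    x = np.sin(2 * np.pi * F0 * t)
    if use_hann:
        x = x * np.hanning(n)  # taper both ends of the record towards zero
    return np.abs(np.fft.rfft(x)) ** 2


def leaked_fraction(power, half_width=1):
    """Share of the total power lying more than half_width bins from the peak."""
    k = int(np.argmax(power))
    lo = max(k - half_width, 0)
    hi = min(k + half_width + 1, len(power))
    return 1.0 - power[lo:hi].sum() / power.sum()


# Whole periods (1 s, 2 s) versus a partial period (1.75 s), no window.
for duration in (1.0, 2.0, 1.75):
    frac = leaked_fraction(power_spectrum(duration))
    print(f"{duration} s: leaked fraction = {frac:.2e}")

# A longer record of 10.5 periods (hypothetical), with and without a Hann window.
for use_hann in (False, True):
    frac = leaked_fraction(power_spectrum(10.5, use_hann), half_width=4)
    print(f"10.5 s, Hann={use_hann}: leaked fraction = {frac:.2e}")
```

With whole periods the leaked fraction should be essentially zero, while the 1.75 s record spreads power into other bins. The Hann window should substantially cut the power far from the peak, at the cost of a wider main lobe.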

score 1 · Accepted Answer · answered Jul 20 '23 at 12:47

The FFT is correct but how you display it is misleading. You need to use your actual sample period:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv('1.csv')

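# The sample index must advance in uniform steps (no gaps)
assert np.all(df.epoch.diff()[1:] == 1)

# One-sided FFT of the real-valued signal; norm='forward' scales by 1/n
yf = np.fft.rfft(df.voltage, norm='forward')
# Frequency axis in Hz, using the actual sample period of 2 s
ff = np.fft.rfftfreq(n=len(df), d=2)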

fig, ax = plt.subplots()
ax.loglog(ff, np.abs(yf))
ax.set_xlabel('Frequency (Hz)')
ax.set_ylabel('Amplitude (V)')
plt.show()
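
As a sanity check on the role of the sample period, the following sketch applies the same `rfft`/`rfftfreq` calls to a synthetic signal instead of the CSV. The 0.05 Hz, 3 V test tone, the 500-sample record length and the helper `peak_frequency` are hypothetical; only the 2 s sample period comes from the thread.

```python
import numpy as np

SAMPLE_PERIOD_S = 2.0  # inter-sample time stated in the comments
N_SAMPLES = 500        # hypothetical record length
TRUE_FREQ_HZ = 0.05    # hypothetical test tone, below the 0.25 Hz Nyquist limit
AMPLITUDE_V = 3.0      # hypothetical amplitude

# Synthetic voltage record sampled every 2 s
t = np.arange(N_SAMPLES) * SAMPLE_PERIOD_S
voltage = AMPLITUDE_V * np.sin(2 * np.pi * TRUE_FREQ_HZ * t)

# One-sided FFT, scaled by 1/n as in the answer
yf = np.fft.rfft(voltage, norm='forward')


def peak_frequency(sample_period_s):
    """Frequency label of the strongest bin for a given assumed sample period."""
    freqs = np.fft.rfftfreq(n=len(voltage), d=sample_period_s)
    return freqs[np.argmax(np.abs(yf))]


print("peak with d=2:", peak_frequency(2.0))  # should be close to the true 0.05 Hz
print("peak with d=1:", peak_frequency(1.0))  # wrong period: labelled about 0.1 Hz
# With norm='forward', a sine of amplitude A shows up as A/2 in the one-sided
# spectrum, so doubling the peak should recover roughly 3 V here.
print("estimated amplitude:", 2 * np.abs(yf).max())
```

Passing `d=1` leaves the spectrum itself unchanged but labels every bin with twice its real frequency. Taking `n` from `len(voltage)` rather than a hard-coded `N` also keeps the frequency axis the same length as the FFT output, whatever the size of the record.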

answered Jul 20 '23 at 12:47

Reinderien

Why use log scaling on both axes? – bardulia Jul 20 '23 at 13:46

Using a linear scale obscures most of the interesting detail in your data – Reinderien Jul 20 '23 at 14:06
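
The point about scaling can be illustrated with a small synthetic example (an assumption, not the data from the question): a 1 V component at 0.01 Hz plus a 1 mV component at 0.1 Hz, sampled every 2 s. All values below are hypothetical.

```python
import numpy as np

n, d = 1000, 2.0  # hypothetical record: 1000 samples, 2 s apart
t = np.arange(n) * d

strong = 1.0 * np.sin(2 * np.pi * 0.01 * t)   # 1 V at 0.01 Hz
weak = 0.001 * np.sin(2 * np.pi * 0.1 * t)    # 1 mV at 0.1 Hz

# One-sided amplitude spectrum in volts
amp = 2 * np.abs(np.fft.rfft(strong + weak, norm='forward'))
freqs = np.fft.rfftfreq(n, d)

# Locate the bins closest to the two component frequencies
i_strong = int(np.argmin(np.abs(freqs - 0.01)))
i_weak = int(np.argmin(np.abs(freqs - 0.1)))

# On a linear axis the weak peak occupies this fraction of the plot height
linear_fraction = amp[i_weak] / amp.max()
# On a log axis the two peaks are this many decades apart
decades_apart = np.log10(amp[i_strong] / amp[i_weak])

print("weak peak as a fraction of full scale:", linear_fraction)
print("decades between the peaks:", decades_apart)
```

On a linear axis the weaker peak would sit at roughly a thousandth of the plot height and be effectively invisible, whereas on a logarithmic axis it is simply about three decades below the stronger one.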
