I'm working with a large amount of data that comes in daily at 32hz. We want to filter the data to .5hz(Edit: Question originally specified 1hz, changed based on suggestion) as well as down sample it to 1hz. I am using signal.resample for down sampling and signal.filtfilt with a signal.butter filter. However after performing both of these the FFT shows the signal fall off around 0.16hz.
Does the order that you filter than downsample matter? Is it something to do with the procedures? Am I not understanding the methods right?
I included the data that I think is relevant
desiredFreq = 1 * 60 *60 *26
# untouched data
quickPlots(x,32) # 32 hz
# filtered
x = ft.butterLowPass(x, .5, 32)
quickPlots(x,32) # 32 hz
# resampled
x = ft.resampleData(x, desiredFreq)
quickPlots(x,1) # should be 1 hz
def butterLowPass(data,desired,original,order = 5):
""" Creates the coefficients for a low pass butterworth
filter. That can change data from its original sample rate
to a desired sample rate
Args:
----
data (np.array): data to be filtered
desirerd (int): the cutoff point for data in hz
original (int): the initial sampling rate in hz
Returns:
-------
np.array: of the data after it has been filtered
Note:
----
I find that the filter will make the data all 0's if the order
is too high. Not sure if this is my end or scipy.signal. So work with
CAUTION
References:
----------
https://docs.scipy.org/doc/scipy-0.14.0/reference/generated/scipy.signal.butter.html
https://docs.scipy.org/doc/scipy-0.18.1/reference/generated/scipy.signal.filtfilt.html
https://en.wikipedia.org/wiki/Butterworth_filter
https://en.wikipedia.org/wiki/Low-pass_filter
https://en.wikipedia.org/wiki/Nyquist_rate
"""
nyq = .5 * original
cutoff = desired / nyq
b, a = signal.butter(order, cutoff, btype = 'lowpass', analog = False)
return signal.filtfilt(b, a, data, padlen =10)
def resampleData(data, desired):
""" Takes in a set of data and resamples the data at the
desired frequency.
Args:
----
data (np.array): data to be operated on
desired (int): the desired sampled points
over the dataset
Returns:
-------
np.array: of the resamples data
Notes:
-----
Since signal.resample works MUCH faster when the len of
data is a power of 2, we pad the data at the end with null values
and then discard them after.
"""
nOld = len(data)
thispow2 = np.floor(np.log2(nOld))
nextpow2 = np.power(2, thispow2+1)
data = np.lib.pad(
data,
(0,int(nextpow2-nOld)),
mode='constant',
constant_values = (0,0)
)
nNew = len(data)
resample = int(np.ceil(desired*nNew/nOld))
data = signal.resample(data, resample)
data = data[:desired]
return data
def quickPlots(data,fs):
import matplotlib as mpl
import matplotlib.pyplot as plt
from scipy import signal
import time
fft,pxx=signal.welch(data,fs)
plt.plot(data)
plt.show()
plt.loglog(fft,pxx)
plt.show()
Pictures of FFT:
Raw data(4hz spike due to other frequencies leaking in):
Edit: After adjusting to filter at .5hz the same problem appears. I am now wondering if it is a problem with how I am displaying my FFT. I've included the quick plotting I was using to display the graphs.