
My dataset has 2000 attributes and 200 samples, and I need to reduce its dimensionality. To do this, I am trying to use the Fourier transform for dimensionality reduction. When I feed my data to the function, it returns the discrete Fourier transform, but I do not know how to use the result to reduce the number of dimensions.

from scipy.fftpack import fft
import pandas as pd

price = pd.read_csv(priceFile(), sep=",")
transformed = fft(price)

Can you please help me?

user3104352
  • This has very little to do with programming and even less to do with Python. But generally, reducing the dimensionality with an FFT requires the data to have certain properties, such as strong low-frequency components (i.e. slowly changing between samples). If that is the case, it is possible, after the transform, to keep only a limited number of FFT bins and drop the others, which are close to zero. – JohanL Oct 11 '18 at 18:48
  • Try PCA instead. – Cris Luengo Nov 11 '18 at 15:54
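
The bin truncation described in the first comment can be sketched as follows. This is only an illustration under stated assumptions: the helper names `reduce_with_rfft` and `reconstruct_from_rfft`, the choice of `k=10`, and the synthetic sinusoidal data are all made up for the example, and it uses `numpy.fft.rfft` rather than the `scipy.fftpack.fft` call from the question. It only works well when each row really is dominated by low frequencies.

```python
import numpy as np

def reduce_with_rfft(X, k):
    """Keep the k lowest-frequency rFFT bins of each row as real-valued features."""
    spectrum = np.fft.rfft(X, axis=1)          # one spectrum per sample (row)
    kept = spectrum[:, :k]                     # low-frequency bins only
    return np.hstack([kept.real, kept.imag])   # shape: (n_samples, 2 * k)

def reconstruct_from_rfft(features, k, n_attributes):
    """Invert reduce_with_rfft, treating every dropped bin as zero."""
    spectrum = np.zeros((features.shape[0], n_attributes // 2 + 1), dtype=complex)
    spectrum[:, :k] = features[:, :k] + 1j * features[:, k:]
    return np.fft.irfft(spectrum, n=n_attributes, axis=1)

# Synthetic stand-in for the real data: 200 slowly varying rows of 2000 attributes.
rng = np.random.default_rng(0)
t = np.arange(2000) / 2000
cycles = rng.integers(1, 5, size=(200, 1))           # 1 to 4 full cycles per row
phases = rng.uniform(0, 2 * np.pi, size=(200, 1))
X = np.sin(2 * np.pi * cycles * t + phases)

features = reduce_with_rfft(X, k=10)
X_hat = reconstruct_from_rfft(features, k=10, n_attributes=X.shape[1])
relative_error = np.linalg.norm(X - X_hat) / np.linalg.norm(X)
print(features.shape, relative_error)
```

Checking the reconstruction error on your own data is a quick way to judge whether the dropped bins really were close to zero; if the error is large, the data does not have the property the comment asks for.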

2 Answers


The Fourier transform is best suited to the case where each of your samples is a time series. If they are, you can extract frequency-domain features for each sample from transformed. Here is a listing of common features in the time and frequency domains that you can consider (reference):

[image: listing of common time- and frequency-domain features]
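
As a rough sketch of what extracting frequency-domain features per sample could look like: the three features below (dominant frequency, spectral centroid, total spectral power) are my own illustrative picks and not necessarily the ones in the listing, and `frequency_features` and `sample_spacing` are hypothetical names. It assumes each row is a uniformly sampled time series.

```python
import numpy as np

def frequency_features(X, sample_spacing=1.0):
    """Return [dominant frequency, spectral centroid, total power] for each row."""
    centered = X - X.mean(axis=1, keepdims=True)        # remove the DC offset
    power = np.abs(np.fft.rfft(centered, axis=1)) ** 2  # power per frequency bin
    freqs = np.fft.rfftfreq(X.shape[1], d=sample_spacing)

    total = power.sum(axis=1)
    safe_total = np.where(total > 0, total, 1.0)        # avoid dividing by zero
    dominant = freqs[np.argmax(power, axis=1)]          # frequency of the largest bin
    centroid = (power * freqs).sum(axis=1) / safe_total # power-weighted mean frequency
    return np.column_stack([dominant, centroid, total])

# Synthetic example: row i contains exactly i + 1 sine cycles over 2000 points.
t = np.arange(2000) / 2000
cycles = np.arange(1, 201).reshape(-1, 1)
X = np.sin(2 * np.pi * cycles * t)

features = frequency_features(X)
print(features.shape)  # 200 samples, 3 features each
```

This turns 2000 attributes into a handful of numbers per sample, so which features you compute matters much more than the transform itself.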

Reveille

Let's say you have a pandas data frame with 2000 attributes and 200 samples, as you mentioned:

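import numpy as np
import pandas as pd

# 200 samples (rows) x 2000 attributes (columns) of random integers in [0, 100)
df = pd.DataFrame(np.random.randint(0, 100, size=(200, 2000)))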

To reduce the dimensionality using scipy, you can generate a new array with the transformed values by first setting the number of dimensions (n_dimensions) that you want and then calling the scipy function (fft).

First we import the fft function:

from scipy.fftpack import fft    

Then we set the number of dimensions; in this case we will use 1 dimension:

n_dimensions = 1

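Then we call the function, passing our data frame first and then the number of dimensions. Note that n sets the length of the transform along each row: a row longer than n is cropped to its first n values before the FFT is computed.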

transformed_data = fft(df,n=n_dimensions)

Then, if you want to work with real numbers, you can take the real part of the transformed array:

transformed_data = transformed_data.real
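
To check what the n argument does to the data here, a small sketch (the value `n_dimensions = 4` and the variable names are just for illustration) compares calling fft with n against cropping the columns by hand:

```python
import numpy as np
from scipy.fftpack import fft

rng = np.random.default_rng(0)
data = rng.integers(0, 100, size=(200, 2000)).astype(float)

n_dimensions = 4
transformed_with_n = fft(data, n=n_dimensions)           # fft crops each row to n values
cropped_then_transformed = fft(data[:, :n_dimensions])   # crop by hand, then transform

print(transformed_with_n.shape)                                    # (200, 4)
print(np.allclose(transformed_with_n, cropped_then_transformed))   # True

# With n=1 the single output value per row is just that row's first attribute.
first_only = fft(data, n=1).real
print(np.allclose(first_only[:, 0], data[:, 0]))                   # True
```

So with this call the remaining 2000 - n attributes never enter the result. If you want every attribute to contribute, transform the full rows first and then select which output bins to keep.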