
I'm following the answer to this question and this scikit-learn tutorial to remove artifacts from an EEG signal. Both seem simple enough, so I'm surely missing something obvious here.

The extracted components don't have the same length as my signal. I have 88 channels of several hours of recordings, so my signal matrix has shape (88, 8088516), yet the output of ICA is (88, 88). Besides being so short, each component seems to capture very large, noisy-looking deflections: out of 88 components only a couple actually look like signal, and the rest look like noise, whereas I would have expected only a few components to look noisy. I suspect I'm doing something wrong here?


Sample code (using a random matrix as a minimal working example):

import numpy as np
from sklearn.decomposition import FastICA
import matplotlib.pyplot as plt

samples_matrix = np.random.random((88, 8088516))

# Compute ICA
ica = FastICA(n_components=samples_matrix.shape[0])  # extract as many components as there are channels, i.e. 88
components = ica.fit_transform(samples_matrix)  # estimated sources
A_ = ica.mixing_  # estimated mixing matrix

The shape of components is (88, 88). One of them, plotted, looks like this:

plt.plot(components[1])

(plot: an 88-point trace of large, noisy-looking deflections)

I would have expected the components to be time series of the same length as my original signal, as shown in the answer to this question. I'm not sure how to move forward with component removal and signal reconstruction from here.

  • Could you link the tutorials you've been following? – Abhigyan Mar 23 '22 at 08:13
  • Also could you provide a minimum reproducible example with some of the data you are using? – Abhigyan Mar 23 '22 at 08:15
  • Added both, thanks! Unfortunately the data file is huge, but I think a randomly generated matrix of the same size illustrates the point as well. If this can be done better without linking the actual data please let me know! – neverreally Mar 23 '22 at 09:51
  • In `sklearn`, a dataset is typically represented by a sample occupying a single *row*. I.e., I would expect your dataset to have shape (8088516, 88). – rickhg12hs Mar 23 '22 at 14:54

1 Answer


You need to run fit_transform on the transpose of your samples_matrix rather than on samples_matrix itself (i.e., pass the method an 8088516 × 88 matrix instead of an 88 × 8088516 one).

ica.fit_transform(samples_matrix.transpose())

or more simply

ica.fit_transform(samples_matrix.T)

which will give you an 8088516 × 88 array of sources (88 components, each as long as the original signal) for plotting. As I mention in my comment below, due to the large matrix inversions involved, my setup could handle no more than 64 components.
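From there, the usual recipe for component removal and reconstruction is to zero out the columns of the source matrix that correspond to artifact components, then map back with inverse_transform. Here is a minimal sketch; the small random stand-in matrix and the artifact indices are made up for illustration, so substitute your real data and whichever components you identify by inspection:

import numpy as np
from sklearn.decomposition import FastICA

samples_matrix = np.random.random((88, 10000))  # small stand-in for the (88, 8088516) recording

ica = FastICA(n_components=88)
sources = ica.fit_transform(samples_matrix.T)  # (10000, 88): one column per component

artifact_idx = [0, 5]         # hypothetical indices of components judged to be artifacts
sources[:, artifact_idx] = 0  # zero out the artifact components

# Map the remaining sources back through the estimated mixing matrix,
# then transpose back to the original (channels, samples) layout.
cleaned = ica.inverse_transform(sources).T  # (88, 10000)

Note that identifying which components are artifacts still has to come from inspection (or a dedicated tool such as MNE's ICA utilities); the indices above are placeholders.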

To back this suggestion up: looking at your tutorial, they set up their toy problem as follows:

import numpy as np
from scipy import signal

n_samples = 2000
time = np.linspace(0, 8, n_samples)

s1 = np.sin(2 * time)  # Signal 1 : sinusoidal signal
s2 = np.sign(np.sin(3 * time))  # Signal 2 : square signal
s3 = signal.sawtooth(2 * np.pi * time)  # Signal 3: saw tooth signal

S = np.c_[s1, s2, s3]
S += 0.2 * np.random.normal(size=S.shape)  # Add noise

S /= S.std(axis=0)  # Standardize data
# Mix data
A = np.array([[1, 1, 1], [0.5, 2, 1.0], [1.5, 1.0, 2.0]])  # Mixing matrix
X = np.dot(S, A.T)  # Generate observations

which gives X a shape of (2000, 3) for separating the 3 components, indicating that (n_samples, n_features) is the format FastICA expects.
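As a quick check, continuing that snippet: fitting FastICA on this X returns full-length sources, one time series per component.

from sklearn.decomposition import FastICA

ica = FastICA(n_components=3)
S_ = ica.fit_transform(X)  # recovered sources
A_ = ica.mixing_           # estimated mixing matrix

print(S_.shape)  # (2000, 3): each component spans all 2000 time points
print(A_.shape)  # (3, 3)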

jonsca
  • How long does it run for you? Mine has been running for over an hour (the only change being the added .transpose()), on a V100, and down to 5 components just as a starting point. I wasn't sure if that's to be expected or if I'm running into another issue with my compute. – neverreally Mar 23 '22 at 11:52
  • Can you also explain why it needs to be transposed? – neverreally Mar 23 '22 at 11:52
  • It's going to take quite a while. You have a lot of datapoints! You've essentially asked it to produce 88 components from a time series that's 88 points long but with 8 million channels, whereas you want to actually produce 88 components out of 88 channels with 8 million samples. You can see why your convergence times were much faster in the former case! – jonsca Mar 23 '22 at 12:03
  • I don't know if your data are from a situation just with a subject resting or if it's ERPs. Either way, segmenting it into smaller sections would be helpful if you're looking for a relevant change (a rough sketch of this follows after these comments). – jonsca Mar 23 '22 at 12:06
  • I didn't know what a V100 was at first, but Scikit-learn only uses the CPU: https://scikit-learn.org/stable/faq.html#will-you-add-gpu-support – jonsca Mar 23 '22 at 12:24
  • Since you have an actual GPU, try out the TensorFlow implementation of FastICA here: https://medium.com/analytics-vidhya/fast-ica-vs-reconstruction-ica-vs-orthonormal-ica-in-tensorflow-matlab-manual-back-prop-in-tf-8b3212924ad0. It will definitely scream on that hardware. – jonsca Mar 23 '22 at 12:36
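A rough sketch of the segmenting idea from the comment above; the segment length, component count, and random stand-in matrix are all arbitrary placeholders, not recommendations:

import numpy as np
from sklearn.decomposition import FastICA

samples_matrix = np.random.random((88, 8088516))  # stand-in for the real recording
segment_len = 250_000  # hypothetical chunk size; tune to your memory/runtime budget

segment_sources = []
for start in range(0, samples_matrix.shape[1], segment_len):
    chunk = samples_matrix[:, start:start + segment_len]
    ica = FastICA(n_components=64)        # fewer components converge faster
    sources = ica.fit_transform(chunk.T)  # (segment length, 64)
    segment_sources.append(sources)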