
I'm new to scikit-learn, but I'm trying to remove eye blinks (noise peaks) from a single EEG channel. I've searched the internet, but I only find more involved approaches using MNE, PyEEG, or other Python modules. I just want something simple that depends only on sklearn. Here is my code:

from sklearn.decomposition import FastICA
import matplotlib.pyplot as plt

# the channel containing some eye blinks
X = f1ep1_data[:, [4]]

# run ICA on the signal
ica = FastICA(n_components=2)
ica.fit(X)

# decompose the signal and reconstruct it from the independent components
components = ica.fit_transform(X)
X_restored = ica.inverse_transform(components)

fig1 = plt.figure()
plt.subplot(3,1,1)
plt.title("Original signal")
plt.plot(f1ep1_timescale, X)

plt.subplot(3,1,2)
plt.title("Components")
plt.plot(f1ep1_timescale, components)

plt.subplot(3,1,3)
plt.title("Signal Reconstructed")
plt.plot(f1ep1_timescale, X_restored)
plt.draw()

Here is the result of the code (a plot of the original signal, the components, and the reconstruction).

What I was expecting was a separation into two components: a cleaner EEG signal and the eye blinks. I can't figure out what the problem is. Can somebody give a helping hand?

Pedro

2 Answers


Did you notice that your "components" are exactly the original signal, just scaled and turned upside down? That is because you cannot get more components out than signals you put in.
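
To see this concretely, here is a minimal sketch (using a made-up random signal in place of your channel) showing that ICA on a single channel can only hand back a rescaled, possibly sign-flipped copy of the input:

import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.RandomState(0)
sig = rng.randn(500, 1)                 # one "EEG channel" (placeholder data)

ica = FastICA(n_components=1, random_state=0)
comp = ica.fit_transform(sig)           # shape (500, 1) - only one component is possible

# the correlation is +/-1: the component is the input up to scale and sign
print(np.corrcoef(sig[:, 0], comp[:, 0])[0, 1])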

You need to perform the following steps:

  1. feed all EEG channels into the ICA
  2. manually remove the components that contain eye blinks or other artifacts
  3. reconstruct using the inverse transform

Let's take a detailed look at step 2: why remove components manually? ICA does not know anything about eye blinks. It separates signals into components based on a statistical measure. If you are lucky, some of these components will look like eye blinks.

OK so far, but the real problem is that the order of the components is not defined. Run ICA and you may find that component 1 contains the eye blinks. Run it again and they are in component 3. Again, and they are in both components 2 and 5...

There is no way to know in advance which components to remove, or how many. That is why you need to tell the algorithm manually, each time it runs.

In code that would look something like this:

# Use all channels - they will contain eye blinks to varying degrees
X = f1ep1_data[:, :]

# run ICA on signal
ica = FastICA(n_components=X.shape[1])  # we want *all* the components
ica.fit(X)

# decompose signal into components
components = ica.transform(X)

# plot components and ask user which components to remove
# ...
remove_indices = [0, 1, 3]  # pretend the user selected components 0, 1, and 3

# "remove" unwanted components by setting them to 0 - simplistic but gets the job done
components[:, remove_indices] = 0

# reconstruct the signal
X_restored = ica.inverse_transform(components)

Chances are you are not happy with the manual step. Tough luck :) In 2013 there was no robust automatic algorithm that would mark eye-blink components, and I don't think that has changed. If there is one, you will find it in one of the domain-specific packages like MNE or PyEEG.


If you happen to have EOG recordings, there is hope, though! There exists *A fully automated correction method of EOG artifacts in EEG recordings*. That approach is based on canonical correlation or regression (I don't remember the details), but you need EOG signals recorded along with the EEG.
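
Just to illustrate the regression idea (this is a rough sketch of my own, not the method from that paper): estimate how strongly each EOG channel leaks into each EEG channel with least squares, then subtract that contribution. Here `eeg` and `eog` are assumed to be arrays of shape (n_samples, n_channels).

import numpy as np

def regress_out_eog(eeg, eog):
    # remove the channel means so the regression only models the leakage
    eeg = eeg - eeg.mean(axis=0)
    eog = eog - eog.mean(axis=0)
    # least-squares propagation coefficients b, such that eog @ b approximates eeg
    b, *_ = np.linalg.lstsq(eog, eeg, rcond=None)
    # subtract the estimated ocular contribution from every EEG channel
    return eeg - eog @ b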


I created a working example with simulated "EEG" data. The data consists of three channels: frontal, central, and parietal. A 10 Hz alpha activity is strongest in the back, and a few blink-like spikes are strongest in the front.

Hopefully, this example better illustrates how to remove components from multi-channel data.

import numpy as np
import scipy.signal as sps
from sklearn.decomposition import FastICA
import matplotlib.pyplot as plt

np.random.seed(42)

n = 1000
fs = 100
noise = 3

# simulate EEG data with eye blinks
t = np.arange(n)
alpha = np.abs(np.sin(10 * t / fs)) - 0.5
alpha[n//2:] = 0
blink = np.zeros(n)
blink[n//2::200] += -1
blink = sps.lfilter(*sps.butter(2, [1*2/fs, 10*2/fs], 'bandpass'), blink)

frontal = blink * 200 + alpha * 10 + np.random.randn(n) * noise
central = blink * 100 + alpha * 15 + np.random.randn(n) * noise
parietal = blink * 10 + alpha * 25 + np.random.randn(n) * noise

eeg = np.stack([frontal, central, parietal]).T  # shape = (1000, 3)

# plot original data
plt.subplot(3, 1, 1)
plt.plot(frontal + 50)
plt.plot(central + 100)
plt.plot(parietal + 150)
plt.yticks([50, 100, 150], ['Fz', 'Cz', 'Pz'])
plt.ylabel('original data')

# decompose EEG and plot components
ica = FastICA(n_components=3)
ica.fit(eeg)
components = ica.transform(eeg)

plt.subplot(3, 1, 2)
plt.plot([[np.nan, np.nan, np.nan]])  # advance the color cycler to give the components a different color :)
plt.plot(components + [0.5, 1.0, 1.5])
plt.yticks([0.5, 1.0, 1.5], ['0', '1', '2'])
plt.ylabel('components')

# looks like component #2 (brown) contains the eye blinks
# let's remove them (hard coded)!
components[:, 2] = 0

# reconstruct EEG without blinks
restored = ica.inverse_transform(components)

plt.subplot(3, 1, 3)
plt.plot(restored + [50, 100, 150])
plt.yticks([50, 100, 150], ['Fz', 'Cz', 'Pz'])
plt.ylabel('cleaned data')
plt.show()

(Figure: the original data, the ICA components, and the cleaned data, as plotted by the code above.)

MB-F
  • Why do I need to feed _all_ channels into the ICA? What if I'm using only one channel? Will ICA not work? – Pedro Nov 06 '18 at 13:14
  • Even manually selecting the eye blinks I'm only getting a `(151L,1L) float64` from `ica_data = ica.fit_transform(blink)`. I guess I'm not using the ICA function as it is supposed to be used... – Pedro Nov 06 '18 at 13:17
  • The ICA transforms *N* channels into at most *N* components. If *N* = 1, one channel gives one component; that is not very useful. – MB-F Nov 06 '18 at 13:34
  • @Pedro I have updated the answer with a concrete example that you can play with. It should demonstrate how the ICA decomposition works. – MB-F Nov 06 '18 at 14:10
  • This is fantastic @kazemakase! Thank you so much for the updated answer and the N-to-N ICA rule. I was able to do it with 4 channels. A couple more questions: `1)` every time the code runs, the eye-blink component might be a different one. How do I know which one it is? Maybe a convolution with a blink template? What would you suggest? `2)` What would be a better alternative to hard-coding the eye-blink component? – Pedro Nov 06 '18 at 17:14
  • **1)** you don't know; that's why I originally proposed the manual "ask the user" step. So you run the algorithm, then a dialog pops up that says *please select the blink component* (you would have to find a way to code that), and then the algorithm continues. If you have a blink template, fine - use convolution and a threshold; it might work. **2)** I would record eye blinks. Place a spare EEG electrode near an eye and record it along with the EEG. Then remove the component that correlates most with that channel. – MB-F Nov 07 '18 at 09:31
  • I see I still have lots of work to do... You're an expert on EEG, right? How did you simulate the eye-blink waveform (I didn't get the Butterworth filter usage)? This waveform could be used to fit the signal. – Pedro Nov 07 '18 at 19:59
  • I'm no longer working in that field. As for the filter, I used Butterworth because it was the first function that came to mind. All I needed was a band-pass to apply to impulse spikes. Then I played with filter frequencies and order until it looked kinda right ;) – MB-F Nov 08 '18 at 08:19

In case you are dealing with a single-channel EEG, the following could be an easy-to-implement method:

1) Detect blinks in the signal x using simple threshold-based peak detection (you may have to tune the threshold by looking at a few blink instances in your signals). Devices from NeuroSky, Muse, and so on come with APIs to detect blinks; you may use those if required.

2) Take the frame corresponding to the blink (xb) and fit a smooth line to it (xbs).

3) Subtract the smooth line (xbs) from the blink frame (xb) and add the mean of the signal to the result. Let's call this xbc.

4) Put xbc back in place of xb in the original signal x.
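
A rough sketch of these four steps (my own interpretation, with an assumed threshold and window width that you would have to tune on your own recordings) could look like this, using scipy's find_peaks for the detection and a low-order polynomial as the "smooth line":

import numpy as np
from scipy.signal import find_peaks

def remove_blinks(x, fs, threshold, half_width=0.25):
    """x: 1-D EEG signal, fs: sampling rate in Hz, threshold: blink amplitude."""
    x = x.copy()
    mean_x = x.mean()
    half = int(half_width * fs)              # samples on each side of a blink peak

    # 1) simple threshold-based blink detection
    peaks, _ = find_peaks(np.abs(x), height=threshold, distance=half)

    for p in peaks:
        lo, hi = max(0, p - half), min(len(x), p + half)
        xb = x[lo:hi]                                    # 2) frame around the blink
        t = np.arange(len(xb))
        xbs = np.polyval(np.polyfit(t, xb, deg=3), t)    #    low-order fit as the smooth line
        xbc = xb - xbs + mean_x                          # 3) subtract the fit, add the signal mean
        x[lo:hi] = xbc                                   # 4) put the corrected frame back
    return x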

user3284005