One of my datasets is imbalanced: the minority class has very few samples compared to the majority class, so I want to balance the data by undersampling the majority class. When I try to use RandomUnderSampler from the imblearn package on a 3D array, I get this error:

ValueError: Found array with dim 3. Estimator expected <= 2.

The features are stored in a 3D array:

train['X'].shape
(276216, 101, 4)

The input labels

train['y'].shape
(276216, 1)

This is the code I run to randomly undersample the data:

from imblearn.under_sampling import RandomUnderSampler
undersample = RandomUnderSampler(sampling_strategy='majority')

X_train_under, y_train_under = undersample.fit(train['X'], train['y'])

I get the above error. Any help would be appreciated.

upendra
    1. Could you please post the full StackTrace? 2. Can you reproduce the error with a minimal example that contains everything necessary for us to reproduce it ourselves with the code you provide? – MangoNrFive Nov 05 '22 at 05:56

1 Answer

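RandomUnderSampler expects a 2D array of shape (n_samples, n_features), so flatten the trailing dimensions of each sample before passing the data in. Also, call fit_resample rather than fit, as per the docs: fit returns the fitted sampler itself, not the resampled arrays, so its result cannot be unpacked into X and y. If you need the 3D layout afterwards, reshape the resampled features back to (-1, 101, 4).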

# Flatten each (101, 4) sample into one row of 404 features; the sampler needs 2D input.
X = train['X'].reshape(train['X'].shape[0], -1)
X_train_under, y_train_under = undersample.fit_resample(X, train['y'])
burglarhobbit