I have tried two ways to apply SMOTE to my dataset, but I can't figure out how to proceed.

First method: I applied data augmentation and then tried to apply SMOTE:


train_data_gen = ImageDataGenerator(
        rescale=1./255,
        zoom_range=0.1,
        horizontal_flip=True)
train_g = train_data_gen.flow_from_directory(
    data_train,
    target_size=(img_height, img_width),
    color_mode = "grayscale",
    batch_size=batch_size,
    class_mode = "sparse"
)
for data, labels in train_g:
  label = labels
sm = SMOTE(random_state=42)
train_smote,train_labels = sm.fit_resample(train_g,label)

I tried the above code, but it takes far too long and doesn't give any output.

Second method:

train_ds = tf.keras.preprocessing.image_dataset_from_directory(
  data_train,
  seed=123,
  image_size=(img_height, img_width),
  batch_size=batch_size)

for data, labels in train_ds:
  label = labels

sm = SMOTE(random_state=42)
train_smote,train_labels = sm.fit_resample(train_ds,label)

This is the error I get for the second method:

Traceback (most recent call last):
  File "trainmvlp.py", line 92, in <module>
    train_smote,train_labels = sm.fit_resample(train_ds,label)
  File "C:\Users\User\Anaconda3\envs\gait\lib\site-packages\imblearn\base.py", line 77, in fit_resample
    X, y, binarize_y = self._check_X_y(X, y)
  File "C:\Users\User\Anaconda3\envs\gait\lib\site-packages\imblearn\base.py", line 130, in _check_X_y
    X, y = self._validate_data(X, y, reset=True, accept_sparse=accept_sparse)
  File "C:\Users\User\Anaconda3\envs\gait\lib\site-packages\sklearn\base.py", line 433, in _validate_data
    X, y = check_X_y(X, y, **check_params)
  File "C:\Users\User\Anaconda3\envs\gait\lib\site-packages\sklearn\utils\validation.py", line 63, in inner_f
    return f(*args, **kwargs)
  File "C:\Users\User\Anaconda3\envs\gait\lib\site-packages\sklearn\utils\validation.py", line 871, in check_X_y
    X = check_array(X, accept_sparse=accept_sparse,
  File "C:\Users\User\Anaconda3\envs\gait\lib\site-packages\sklearn\utils\validation.py", line 63, in inner_f
    return f(*args, **kwargs)
  File "C:\Users\User\Anaconda3\envs\gait\lib\site-packages\sklearn\utils\validation.py", line 687, in check_array
    raise ValueError(
ValueError: Expected 2D array, got scalar array instead:
array=<PrefetchDataset shapes: ((None, 64, 64, 3), (None,)), types: (tf.float32, tf.int32)>.
Reshape your data either using array.reshape(-1, 1) if your data has a single feature or array.reshape(1, -1) if it contains a single sample.

I have tried to reshape it, but it still shows the same error. Could anyone tell me what I am doing wrong? Thank you in advance.

Jenny
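A possible direction, offered as a sketch rather than a confirmed fix: in both snippets, `fit_resample` receives the generator or dataset object itself, while SMOTE expects a 2D array of shape `(n_samples, n_features)`. That would explain the "got scalar array" message. The loops also overwrite `label` on every pass, so only the last batch of labels survives, and a Keras `ImageDataGenerator` iterator typically yields batches indefinitely, which is likely why the first method never finishes. The helper names below (`collect_batches`, `flatten_images`, `restore_images`, `smote_images`) are hypothetical and not part of any library. The sketch assumes the whole training set fits in memory.

```python
from itertools import islice

import numpy as np


def collect_batches(batch_iter, num_batches=None):
    """Gather (images, labels) batches into two NumPy arrays.

    num_batches caps how many batches are read, which matters for
    iterators that never stop on their own. None reads until exhaustion.
    """
    images, labels = [], []
    for x, y in islice(batch_iter, num_batches):
        images.append(np.asarray(x))
        labels.append(np.asarray(y))
    return np.concatenate(images), np.concatenate(labels)


def flatten_images(images):
    """Turn (n, height, width, channels) into the 2D (n, features) layout."""
    image_shape = images.shape[1:]
    return images.reshape(len(images), -1), image_shape


def restore_images(flat, image_shape):
    """Undo flatten_images so the result can be fed to a CNN again."""
    return flat.reshape(-1, *image_shape)


def smote_images(images, labels, random_state=42):
    """Oversample minority classes with SMOTE on flattened pixel vectors."""
    # Imported here so the helpers above work without imbalanced-learn.
    from imblearn.over_sampling import SMOTE

    flat, image_shape = flatten_images(images)
    flat_res, labels_res = SMOTE(random_state=random_state).fit_resample(flat, labels)
    return restore_images(flat_res, image_shape), labels_res
```

With the generator from the first method this might be called as `X, y = collect_batches(train_g, len(train_g))`, and with the dataset from the second method as `X, y = collect_batches(train_ds)`, followed by `train_smote, train_labels = smote_images(X, y)`. SMOTE interpolates between raw pixel vectors here, so the synthetic images are blends of neighbouring samples. Whether that helps for this data is something to check experimentally.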