
I am working on a multi-label prediction task where each label is a one-hot encoded vector such as [1, 0, 0], [0, 1, 0] or [0, 0, 1], stored as an ndarray.

The dataset is imbalanced, so I am using SMOTE. This works and upsamples all minority classes (each one is upsampled until it holds as many records as the majority class).

Now I want to upsample to fewer records than that. According to the documentation, I can use sampling_strategy and provide a dict with key = class label and value = total number of records for that class.

However, I cannot use an ndarray as a key in my dict (TypeError: unhashable type: 'numpy.ndarray'). What is the best way to handle this? SMOTE can obviously handle these one-hot encoded vectors -- so how do I specify the total records per class?
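To make the problem concrete, here is a minimal sketch of one possible workaround, not a confirmed answer: collapse each one-hot row to its integer column index with argmax, use those hashable integers as the dict keys, and convert back to one-hot afterwards. The helper names (onehot_to_int, check_targets), the toy data and the target counts are all made up for illustration, and the sketch assumes every row contains exactly one 1. The SMOTE call itself only runs if imbalanced-learn is installed.

```python
import numpy as np


def onehot_to_int(y_onehot):
    """Collapse one-hot rows to integer class indices (hypothetical helper)."""
    y_onehot = np.asarray(y_onehot)
    # Assumption: exactly one 1 per row, i.e. the rows are true one-hot vectors.
    if not np.all(y_onehot.sum(axis=1) == 1):
        raise ValueError("expected exactly one active class per row")
    return y_onehot.argmax(axis=1)


def check_targets(y_int, targets):
    """Validate {class_index: desired_total} against current counts (hypothetical helper)."""
    classes, counts = np.unique(y_int, return_counts=True)
    current = {int(c): int(n) for c, n in zip(classes, counts)}
    for cls, total in targets.items():
        # Over-sampling can only add records, so a target below the current count is an error.
        if total < current[cls]:
            raise ValueError(f"class {cls}: target {total} < current {current[cls]}")
    return dict(targets)


# Toy labels: 10 records of class 0, 4 of class 1, 3 of class 2 (illustrative only).
y_onehot = np.array([[1, 0, 0]] * 10 + [[0, 1, 0]] * 4 + [[0, 0, 1]] * 3)
y_int = onehot_to_int(y_onehot)

# Integer keys are hashable, unlike ndarray keys.
strategy = check_targets(y_int, {1: 6, 2: 5})

try:
    from imblearn.over_sampling import SMOTE
except ImportError:
    SMOTE = None  # imbalanced-learn not installed; skip the resampling step

if SMOTE is not None:
    rng = np.random.default_rng(0)
    X = rng.normal(size=(len(y_int), 2))  # made-up features
    # k_neighbors=2 because the smallest toy class has only 3 records.
    smote = SMOTE(sampling_strategy=strategy, k_neighbors=2, random_state=0)
    X_res, y_res = smote.fit_resample(X, y_int)
    # Back to one-hot: pick the matching row of the identity matrix.
    y_res_onehot = np.eye(y_onehot.shape[1], dtype=int)[y_res]
```

Classes left out of the dict (class 0 here) are not resampled. Whether the same integer keys also work when the one-hot array itself is passed as y is something I have not verified.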

Janothan

0 Answers