I am trying to deal with an imbalanced data set using imblearn's RandomUnderSampler. I want to specify manually how many samples of each label should remain after under-sampling. Here is my code:

from collections import Counter
from imblearn.under_sampling import RandomUnderSampler

sm = RandomUnderSampler(ratio={0: 142498, 1: 495}, random_state=42)
X_train, y_train = sm.fit_sample(X_tr, encoded_Ytrain)
print(format(Counter(y_train)))

However, this throws the error:

  File "first_approach.py", line 56, in <module>
    X_train, y_train = sm.fit_sample(X_tr,encoded_Ytrain)
    raise ValueError('Unknown parameter type for ratio.')
ValueError: Unknown parameter type for ratio.

What is the correct syntax for passing a dictionary of per-class sample counts?

sophros
Saurav--

2 Answers

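Depending on the version you're using, you have to pass the dict as "sampling_strategy" instead of "ratio". In imbalanced-learn 0.4 and later, "sampling_strategy" replaces "ratio", which was deprecated and later removed.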

ramobal

Try installing version 0.3.

imblearn 0.2.1 does not accept a dictionary for "ratio". You will need to install a newer version from source:

pip install -U git+https://github.com/scikit-learn-contrib/imbalanced-learn.git
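To tie the version notes in these answers together, here is a small sketch of a helper that picks the keyword to use for the dict. The function name `dict_keyword_for` is hypothetical (it is not part of imbalanced-learn), and the version boundaries are assumptions taken from the answers here: no dict support before 0.3, `ratio` in 0.3.x, `sampling_strategy` from 0.4 onward.

```python
def dict_keyword_for(version: str) -> str:
    """Return the keyword assumed to accept a per-class dict for a given version.

    Hypothetical helper; the version cut-offs follow the answers in this thread.
    """
    # Compare only the major and minor components, e.g. "0.2.1" -> (0, 2).
    major, minor = (int(part) for part in version.split(".")[:2])
    if (major, minor) < (0, 3):
        raise ValueError("dict not supported before 0.3; upgrade imbalanced-learn")
    return "ratio" if (major, minor) < (0, 4) else "sampling_strategy"


print(dict_keyword_for("0.3.4"))
print(dict_keyword_for("0.10.1"))
```

With the library installed, the current version string can be read from `imblearn.__version__` and passed to this helper.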
Kai
shreyy