I am working on a classification model where my dataset is imbalanced, and I want to balance it using the oversample() function from the 'imbalance' package in R.

Below is the code used for oversampling, where 'Final.Status' is my response variable and is a factor.

training <- na.omit(training)

training.oversamp <- oversample(training, method = "SMOTE", classAttr = "Final.Status")

But when I run it, I get the error below:

Error in dataset[, classAttr] == c : 
  comparison of these types is not implemented
In addition: Warning message:
In which(dataset[, classAttr] == c) :
  Incompatible methods ("Ops.data.frame", "Ops.factor") for "=="

Also, out of curiosity, can anyone briefly describe the different methods available in oversample() and say which one is most commonly used?

  • How many levels does `Final.Status` have? Moreover, I think you need to define `ratio` in `oversample`. By default it is `NA`. – carlo_sguera Jul 21 '20 at 12:06
  • There are two levels. What is the purpose of `ratio`? I tried setting `ratio`, but I still get the same error. – Nick Jul 21 '20 at 12:17
  • From `help(oversample)`: `ratio` - Number between 0 and 1 indicating the desired ratio between minority examples and majority ones, that is, the quotient size of minority class/size of majority class. There are methods, such as ADASYN or wRACOG to which this parameter does not apply. – carlo_sguera Jul 21 '20 at 12:23
  • Can you post the result of `str(training)`? – carlo_sguera Jul 21 '20 at 12:26
  • @carlo_sguera I worked around it by using a different function, `upSample()` from the `caret` package. – Nick Jul 21 '20 at 12:48

1 Answer

I had the exact same error, and my solution was to convert the dataset from a tibble to a data.frame.

In your case it could be as follows:

training.oversamp <- oversample(as.data.frame(training), method = "SMOTE", classAttr = "Final.Status")
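
A likely reason the conversion helps, judging from the error message: the failing expression is `dataset[, classAttr] == c`, and a plain data.frame drops a single selected column to a vector, whereas a tibble keeps it as a one-column table. The sketch below uses base R only and mimics the tibble behaviour with `drop = FALSE`; the toy data and the column `x` are made up for illustration, and it does not reproduce the package's internals.

```r
# Toy data; only the Final.Status name comes from the question.
df <- data.frame(
  Final.Status = factor(c("yes", "no", "no", "no")),
  x = c(1.2, 0.7, 3.1, 2.4)
)

# A plain data.frame drops a single selected column to a vector (here a factor).
as_vector <- df[, "Final.Status"]

# A tibble keeps it as a one-column table; drop = FALSE mimics that in base R.
as_table <- df[, "Final.Status", drop = FALSE]

is.factor(as_vector)     # TRUE
is.data.frame(as_table)  # TRUE

# Comparing the one-column table with a factor is the kind of operation that
# can fail; tryCatch records the message instead of stopping the script.
outcome <- tryCatch(
  suppressWarnings({
    as_table == as_vector
    "comparison succeeded"
  }),
  error = function(e) conditionMessage(e)
)
print(outcome)

# The factor-to-string comparison works element-wise.
sum(as_vector == "no")
```

Running `class(training)` before calling `oversample()` shows whether the data is a tibble (`"tbl_df"` appears among the classes), in which case `as.data.frame()` is worth trying first.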

AugtPelle