I am running the SMOTE function as shown below:

# install.packages("DMwR")  # needed for the SMOTE implementation
library(DMwR)
smoted_data <- SMOTE(state ~ ., deliq, perc.over = 200, perc.under = 1600)

But I am getting the following error:

Error in factor(newCases[, a], levels = 1:nlevels(data[, a]), labels = levels(data[, :
  invalid 'labels'; length 0 should be 1 or 2
In addition: Warning message:
In smote.exs(data[minExs, ], ncol(data), perc.over, k) :
  NAs introduced by coercion

I checked all factor variables and none of them contains 0 at any level.

There are no NAs in the data either. I checked all the related posts on Stack Overflow but did not find anything relevant to my case.

What are possible reasons for this?

SKB

2 Answers

The workaround I found was to do something like the following (keep only the predictors and the target variable).

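# Placeholder names: replace with the names of your own predictor columns
x <- c("predictor1", "predictor2")

# Build the formula state ~ predictor1 + predictor2 + ...
fmla <- as.formula(paste("state ~", paste(x, collapse = " + ")))

# Pass only the target and the chosen predictors to SMOTE
smoted_data <- SMOTE(fmla, subset(deliq, select = c("state", x)), perc.over = 200, perc.under = 1600)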
M.K
Fan XIong

Check that the categorical variables are really stored as factors and not as character vectors. To check, run: str(mydata) If any of them are character, convert them to factors with as.factor(), for example: mydata$myvariable <- as.factor(mydata$myvariable)
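
The check and conversion can be sketched in base R as follows. This is only an illustration: the small deliq data frame and every column name except state are made up, since the question does not show the real data. It also shows why a character column would plausibly produce both messages in the question: a character vector has no levels (length 0), and coercing it to integer gives NA with the warning "NAs introduced by coercion".

```r
# Hypothetical stand-in for the question's data; only `state` comes from the post.
deliq <- data.frame(
  state   = factor(c("good", "bad", "good", "good")),
  region  = c("north", "south", "north", "east"),  # character, not factor
  balance = c(120.5, 80, 310.2, 45),
  stringsAsFactors = FALSE
)

# Find the columns that are stored as character vectors.
char_cols <- names(deliq)[vapply(deliq, is.character, logical(1))]
print(char_cols)

# A character column has no levels, so levels() is NULL and nlevels() is 0.
print(levels(deliq$region))
print(nlevels(deliq$region))

# Convert every character column to a factor in place.
deliq[char_cols] <- lapply(deliq[char_cols], as.factor)

# Confirm the structure before passing the data to SMOTE.
str(deliq)
```

After the conversion, str(deliq) should list region as a Factor, and the data can be passed to SMOTE again. If the error persists, the column-subsetting workaround from the other answer is worth trying.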