
I am developing a Random Forest model in R. The data has 2000 observations and 20 features, all of them categorical. The target I am trying to classify has 7 levels (classes A to G below).

The target is skewed: one class (A) makes up about 66% of the observations, and the remaining 34% is distributed among the other six classes. The distribution is as follows:

Class    Proportion
A        0.660185185
B        0.002314815
C        0.0027777
D        0.0722222
E        0.181944444
F        0.013425926
G        0.067129630

I am trying to use ROSE or SMOTE to balance the data set, but I get an error saying that they only work for binary classification.

Is there a library available in R to balance multiclass data sets? Right now the model's accuracy is very low (around 64%). I am hoping that balancing the data set might improve it.
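
For reference, here is a minimal sketch of one library-free option: random oversampling of every class up to the size of the largest one, in base R. It is a plain resampling scheme, not what ROSE or SMOTE do internally. The helper name `oversample_to_majority` and the toy data frame are made up for illustration; only base R functions are used.

```r
# Hypothetical helper (not from any package): randomly oversample each class
# up to the size of the largest class by duplicating rows with replacement.
oversample_to_majority <- function(data, target) {
  n_max <- max(table(data[[target]]))
  # Row indices grouped by class; drop = TRUE skips empty factor levels
  rows_by_class <- split(seq_len(nrow(data)), data[[target]], drop = TRUE)
  idx <- unlist(lapply(rows_by_class, function(rows) {
    extra <- n_max - length(rows)
    # Keep every original row, then append resampled duplicates.
    # sample.int() on positions avoids the sample(x) pitfall when
    # a class has a single row.
    c(rows, rows[sample.int(length(rows), extra, replace = TRUE)])
  }), use.names = FALSE)
  data[idx, , drop = FALSE]
}

set.seed(42)
# Made-up toy data with a skewed 3-class target and one categorical feature
toy <- data.frame(
  target = factor(rep(c("A", "B", "C"), times = c(66, 4, 30))),
  x1 = factor(sample(c("u", "v"), 100, replace = TRUE))
)

balanced <- oversample_to_majority(toy, "target")
table(balanced$target)  # every class now has 66 rows
```

If something like this is used, it should probably be applied to the training split only, after the train/test split, so that duplicated rows do not leak into the evaluation data. It may also help to look at per-class metrics rather than overall accuracy, since always predicting class A would already score about 66% on this distribution. As an alternative to resampling the data, the `randomForest()` function in the randomForest package accepts `strata` and `sampsize` arguments for stratified sampling within each tree.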

Any help in this matter will be appreciated.

cheers -Nitin

