I am developing a model using Random Forest in R. The data has 2,000 observations and 20 features. The target that I am trying to classify has 7 levels (classes A through G, as listed in the distribution below). All the variables are categorical in nature.
The target is skewed: one class makes up over 65% of the observations, and the remaining 35% or so is spread across the other classes. The distribution is shown below.
Class     Proportion
Class A   0.660185185
Class B   0.002314815
Class C   0.0027777
Class D   0.0722222
Class E   0.181944444
Class F   0.013425926
Class G   0.067129630
I am trying to use ROSE or SMOTE to balance the data set, but I get an error saying that they work only for binary classification.
Is there a library available in R to balance multiclass data sets? Right now the accuracy of the model is very low (around 64%). I am hoping that balancing the data set might improve the accuracy.
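To make clear what I mean by balancing, here is a minimal base R sketch of random oversampling for a multiclass target. It is only an illustration, not what ROSE or SMOTE do internally: the helper `oversample_to_majority`, the data frame `toy`, and the column names `target` and `x1` are all hypothetical, and the toy data is made up rather than taken from my real data set.

```r
# Hypothetical helper: randomly oversample every class up to the size of the
# largest class. `data` is a data frame, `target` is the name of a factor column.
oversample_to_majority <- function(data, target) {
  # Row indices grouped by class (unused factor levels are dropped)
  idx_by_class <- split(seq_len(nrow(data)), data[[target]], drop = TRUE)
  n_max <- max(lengths(idx_by_class))

  balanced_idx <- unlist(lapply(idx_by_class, function(idx) {
    extra <- n_max - length(idx)
    # Keep all original rows, then draw the shortfall with replacement.
    # Indexing via sample.int() keeps this correct for single-row classes.
    c(idx, idx[sample.int(length(idx), extra, replace = TRUE)])
  }), use.names = FALSE)

  data[balanced_idx, , drop = FALSE]
}

# Toy data, invented for illustration only
set.seed(42)
toy <- data.frame(
  target = factor(rep(c("A", "B", "C"), times = c(66, 4, 30))),
  x1     = factor(sample(c("u", "v", "w"), 100, replace = TRUE))
)

balanced <- oversample_to_majority(toy, "target")
table(balanced$target)  # every class now has as many rows as the largest one
```

I assume something like this would have to be applied to the training split only, so that duplicated rows do not leak into the test set. What I am looking for is a package that does this (or something smarter, such as SMOTE-style synthetic samples) for more than two classes.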
Any help in this matter will be appreciated.
cheers -Nitin