I have a severe class imbalance: the positive class is about 3% of the data, which is roughly 6,000 rows in absolute terms. I'm currently using sparklyr with the Spark MLlib algorithms. Some of the native Databricks MLlib algorithms accept a class-weight parameter for handling imbalance. Is that available in sparklyr? I'm currently using ml_random_forest_classifier to classify a dichotomous outcome. Thanks.
Reproducible code is in this related question: "Sparklyr Spark ML Feature Importance after feature transformation".