I have a severe class imbalance: the positive class is about 3% of the data, which is roughly 6,000 rows in absolute terms. I'm currently using sparklyr with the Spark MLlib algorithms. Some of the native Databricks MLlib tooling takes class weights for imbalanced data as a parameter. Is that available in sparklyr? I'm currently using ml_random_forest_classifier to classify a dichotomous outcome. Thanks.

https://docs.databricks.com/machine-learning/automl/how-automl-works.html#imbalanced-dataset-support-for-classification-problems

Reproducible code is here: "Sparklyr Spark ML Feature Importance after feature transformation".

Choc_waffles
