
I am using the xgboost PySpark API. The API is experimental, but it supports most of the features of the regular xgboost API.

As per the documentation below, the eval_set parameter is not supported; the validationIndicatorCol parameter should be used instead.

  1. https://xgboost.readthedocs.io/en/latest/python/python_api.html#module-xgboost.spark

  2. https://databricks.github.io/spark-deep-learning/#module-sparkdl.xgboost
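
The validation-indicator column itself can be created with a plain random split (a minimal sketch; the 20% fraction is illustrative, and the column name has to match whatever is passed as validationIndicatorCol):

    from pyspark.sql import functions as F

    # Flag roughly 20% of rows as the validation set; flagged rows are scored
    # with eval_metric during training instead of being trained on.
    sampled_df = sampled_df.withColumn("isVal", F.rand(seed=1) < 0.2)

The estimator and pipeline are then set up like this: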

    from pyspark.ml import Pipeline
    from sparkdl.xgboost import XgboostClassifier

    xgb = XgboostClassifier(featuresCol="features",
                            labelCol="label",
                            num_workers=40,
                            random_state=1,
                            missing=None,
                            objective='binary:logistic',
                            validationIndicatorCol='isVal',
                            eval_metric='aucpr',
                            n_estimators=best_n_estimators,
                            max_depth=best_max_depth,
                            learning_rate=best_learning_rate)

    # vectorAssembler (defined earlier) produces the "features" vector column.
    pipeline = Pipeline(stages=[vectorAssembler, xgb])
    pipelineModel = pipeline.fit(sampled_df)
    

It seems to run without any errors, which is great.

How do you print and inspect the evaluation results? Traditional xgboost has an evals_result() method, but pipelineModel.stages[-1].evals_result() doesn't seem to work in the PySpark API, even though the PySpark API documentation doesn't say the method is unsupported. Any idea how to make it work?
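
For concreteness, this is the call that fails (assuming the fitted xgboost model is the last pipeline stage):

    # The fitted xgboost model is the last stage of the pipeline.
    model = pipelineModel.stages[-1]

    # Fine in the sklearn-style xgboost API, but doesn't work here:
    print(model.evals_result())

(For what it's worth, the newer xgboost.spark models do expose the underlying Booster via get_booster(), but a Booster object has no evals_result() method either.)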

Vusal
  • I am attempting to do something similar, except with LightGBM (whose PySpark interface largely mirrors that of XGBoost for PySpark). I will let you know what I find. One question for you: are you passing the 'isVal' column to the VectorAssembler as one of the inputCols, or no? – Greg Aponte Feb 13 '23 at 20:07
  • Is there LightGBM for PySpark? I didn't know that. Yes, the isVal column should be fed into the VectorAssembler. – Vusal Mar 23 '23 at 16:23

0 Answers