I am trying to tune my XGBoost model on Spark using Scala. My XGBoost parameter grid is as follows:

val xgbParamGrid = (new ParamGridBuilder()
                .addGrid(xgb.maxDepth, Array(8, 16))
                .addGrid(xgb.minChildWeight, Array(0.5, 1, 2))
                .addGrid(xgb.alpha, Array(0.8, 0.9, 1))
                .addGrid(xgb.lambda, Array(0.8, 1, 2))
                .addGrid(xgb.scalePosWeight, Array(1, 5, 9))
                .addGrid(xgb.subSample, Array(0.5, 0.8, 1))
                .addGrid(xgb.eta, Array(0.01, 0.1, 0.3, 0.5))
                .build())

The call to the cross validator is as follows:

val evaluator = (new BinaryClassificationEvaluator()
                      .setLabelCol("label")
                      .setRawPredictionCol("prediction")
                      .setMetricName("areaUnderPR"))

val cv = (new CrossValidator()
          .setEstimator(pipeline_model_xgb)
          .setEvaluator(evaluator)
          .setEstimatorParamMaps(xgbParamGrid)
          .setNumFolds(10))

val xgb_model = cv.fit(train)

I am getting the following error just for the scalePosWeight parameter:

error: type mismatch;
found   : org.apache.spark.ml.param.DoubleParam
required: org.apache.spark.ml.param.Param[AnyVal]
Note: Double <: AnyVal (and org.apache.spark.ml.param.DoubleParam <: org.apache.spark.ml.param.Param[Double]), but class Param is invariant in type T.
You may wish to define T as +T instead. (SLS 4.5)
                              .addGrid(xgb.scalePosWeight, Array(1, 5, 9))
                                           ^

Based on my search, the message "You may wish to define T as +T instead" is common but I am not sure how to fix this here. Thanks for your help!

Arjun Mishra
  • I don't know Scala, but could it be that XGBoost expects a float type as input for that parameter? – Mischa Lisovyi Jun 29 '18 at 08:35
  • I thought so too, but that defeats the purpose of parameter selection if it can only take a single value, don't you think? – Arjun Mishra Jun 29 '18 at 18:02
  • I totally agree, but as I said, I don't know Scala, and other posts on SO with the same error message seem to be about type mismatches. – Mischa Lisovyi Jun 29 '18 at 20:34

1 Answer

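I ran into the same issue when setting the Array for minChildWeight, where the array consisted of Int values only. The error message shows that scalePosWeight is a DoubleParam, and an all-integer literal such as Array(1, 5, 9) is inferred as Array[Int], which does not match it; the mixed literals in the question, such as Array(0.5, 1, 2), are widened to Array[Double] and so compile. The solution that worked (for both scalePosWeight and minChildWeight) is to pass an Array of Doubles: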

.addGrid(xgb.scalePosWeight, Array(1.0, 5.0, 9.0))
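
To see why the element type matters without needing a Spark cluster, here is a minimal sketch. The classes Param, DoubleParam and GridBuilder below are hypothetical stand-ins that only mirror the shape of Spark's Param, DoubleParam and ParamGridBuilder (an invariant Param[T] plus a generic and a Double-specific addGrid overload); they are not the real implementations.

```scala
object GridSketch {
  // Hypothetical stand-ins mirroring the shape of Spark's Param and DoubleParam.
  class Param[T](val name: String) // invariant in T, as the error message says of Spark's Param
  class DoubleParam(name: String) extends Param[Double](name)

  // Hypothetical builder with two overloads shaped like those of ParamGridBuilder.
  class GridBuilder {
    private val entries = scala.collection.mutable.LinkedHashMap.empty[String, Seq[Any]]

    // Generic overload: T has to match the parameter's type exactly.
    def addGrid[T](param: Param[T], values: Iterable[T]): this.type = {
      entries(param.name) = values.toSeq
      this
    }

    // Double-specific overload: only applies when the values are an Array[Double].
    def addGrid(param: DoubleParam, values: Array[Double]): this.type =
      addGrid[Double](param, values.toSeq)

    def build(): Map[String, Seq[Any]] = entries.toMap
  }

  val scalePosWeight = new DoubleParam("scalePosWeight")
  val minChildWeight = new DoubleParam("minChildWeight")

  val intLiterals = Array(1, 5, 9)     // inferred as Array[Int]
  val mixedLiterals = Array(0.5, 1, 2) // Int literals widen, giving Array[Double]

  // Passing intLiterals to addGrid(scalePosWeight, ...) is the failing case from
  // the question: an Array[Int] does not match the Array[Double] overload.
  val grid: Map[String, Seq[Any]] = new GridBuilder()
    .addGrid(scalePosWeight, Array(1.0, 5.0, 9.0)) // Double literals
    .addGrid(minChildWeight, Array[Double](1, 2))  // explicit element type also works
    .build()

  def main(args: Array[String]): Unit =
    grid.foreach { case (name, values) => println(s"$name -> ${values.mkString(", ")}") }
}
```

If the values already live in an Array[Int], converting with .map(_.toDouble) before calling addGrid should have the same effect as writing Double literals.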
datapug