I am training a Random Forest model in Spark 2.3 using a StringIndexer, a OneHotEncoderEstimator, and a RandomForestRegressor, like this:
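import org.apache.spark.ml.Pipeline
import org.apache.spark.ml.evaluation.RegressionEvaluator
import org.apache.spark.ml.feature.{OneHotEncoderEstimator, StringIndexer, VectorAssembler}
import org.apache.spark.ml.regression.RandomForestRegressor
import org.apache.spark.ml.tuning.{CrossValidator, ParamGridBuilder}
// Index each categorical column (fitted up front on the training data)
val stringIndexers = categoricalColumns.map { colName =>
  new StringIndexer()
    .setInputCol(colName)
    .setOutputCol(colName + "Idx")
    .setHandleInvalid("keep")
    .fit(training)
}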
// One-hot encode each indexed column
val encoders = featuresEnconding.map { colName =>
  new OneHotEncoderEstimator()
    .setInputCols(Array(colName + "Idx"))
    .setOutputCols(Array(colName + "Enc"))
    .setHandleInvalid("keep")
}
// Assemble the feature columns into a single feature vector column
val assembler = new VectorAssembler()
  .setInputCols(featureColumns)
  .setOutputCol("features")
// Random forest regressor
val rf = new RandomForestRegressor()
  .setLabelCol("label")
  .setFeaturesCol("features")
  .setMaxBins(1000)
val stepsRF = stringIndexers ++ encoders ++ Array(assembler, rf)
val pipelineRF = new Pipeline().setStages(stepsRF)
// Parameter grid to search over
val paramGridRF = new ParamGridBuilder()
  .addGrid(rf.minInstancesPerNode, Array(1, 5, 15))
  .addGrid(rf.maxDepth, Array(10, 11, 12))
  .addGrid(rf.numTrees, Array(20, 50, 100))
  .build()
// Define the evaluator
val evaluatorRF = new RegressionEvaluator()
  .setLabelCol("label")
  .setPredictionCol("prediction")
// Use cross-validation to train the model
val cvRF = new CrossValidator()
  .setEstimator(pipelineRF)
  .setEvaluator(evaluatorRF)
  .setEstimatorParamMaps(paramGridRF)
  .setNumFolds(10)
  .setParallelism(3)
// Fit the model on the training dataset
val cvRFModel = cvRF.fit(training)
I am not sure what the best combination of parameters is for this model, so I added the following parameter grid:
  .addGrid(rf.minInstancesPerNode, Array(1, 5, 15))
  .addGrid(rf.maxDepth, Array(10, 11, 12))
  .addGrid(rf.numTrees, Array(20, 50, 100))
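For a sense of scale, the size of this search can be worked out by hand. The sketch below only does arithmetic on the values from the grid and from setNumFolds(10); it does not touch Spark. It assumes CrossValidator fits every combination once per fold, with one final refit of the winning combination on the full training set on top of that.

```scala
// Values copied from the grid above
val minInstancesPerNode = Array(1, 5, 15)
val maxDepth = Array(10, 11, 12)
val numTrees = Array(20, 50, 100)
val numFolds = 10 // from setNumFolds(10)

// One candidate per element of the Cartesian product of the three arrays
val combinations = minInstancesPerNode.length * maxDepth.length * numTrees.length
// Each candidate is fitted once per fold during cross-validation
val cvFits = combinations * numFolds

println(s"$combinations combinations, $cvFits pipeline fits during cross-validation")
```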
I then let the CrossValidator work out the best combination. Now I would like to find out which combination it picked, so that I can keep tuning the model from there. I tried to get these parameters like this:
cvRFModel.bestModel.extractParamMap
But I am getting an empty map:
org.apache.spark.ml.param.ParamMap =
{
}
What am I missing?
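For reference, here is a minimal sketch of one way the chosen combination might be recovered. It assumes the fitted CrossValidatorModel exposes the candidate grid through getEstimatorParamMaps and the per-candidate averaged metric through avgMetrics, in the same order. The helper name pickBest and the toy values are hypothetical and only illustrate the selection step; the block itself is plain Scala and does not depend on Spark.

```scala
// Hypothetical helper (not part of Spark): pairs each candidate with its
// averaged metric and returns the best pair.
def pickBest[A](candidates: Seq[A], avgMetrics: Seq[Double], largerIsBetter: Boolean): (A, Double) = {
  require(candidates.nonEmpty && candidates.length == avgMetrics.length,
    "need exactly one metric per candidate")
  val scored = candidates.zip(avgMetrics)
  // For an error metric such as RMSE, smaller is better
  if (largerIsBetter) scored.maxBy(_._2) else scored.minBy(_._2)
}

// Toy stand-ins for a grid and its cross-validated RMSE values (made-up numbers)
val toyGrid = Seq("maxDepth=10", "maxDepth=11", "maxDepth=12")
val toyRmse = Seq(4.2, 3.9, 4.0)
val (bestEntry, bestRmse) = pickBest(toyGrid, toyRmse, largerIsBetter = false)
println(s"$bestEntry -> $bestRmse")
```

With the model from the question, the call would presumably look like pickBest(cvRFModel.getEstimatorParamMaps, cvRFModel.avgMetrics, cvRFModel.getEvaluator.isLargerBetter). A likely reason for the empty map is that bestModel is a PipelineModel here, so the tuned parameters sit on its individual stages rather than on the pipeline itself; casting bestModel to PipelineModel and calling extractParamMap on the last stage may be another route.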