I am trying to save a grid-searched PySpark TrainValidationSplitModel object. While tuning the regularization (regParam) of a logistic regression, saving the fitted model raises the following error:
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-104-8e6b86f1e92c> in <module>
      1 # Save model, or upload if already saved
      2 if not os.path.isdir(drive_path + 'lr_2_model'):
----> 3     lr_2_model.save(drive_path + 'lr_2_model')
      4 else:
      5     lr_2_model = TrainValidationSplitModel.load(drive_path + 'lr_2_model')

5 frames
/content/spark-3.3.0-bin-hadoop3/python/pyspark/ml/tuning.py in meta_estimator_transfer_param_maps_to_java(pyEstimator, pyParamMaps)
    324                         break
    325                 if javaParam is None:
--> 326                     raise ValueError("Resolve param in estimatorParamMaps failed: " + str(pyParam))
    327                 if isinstance(pyValue, Params) and hasattr(pyValue, "_to_java"):
    328                     javaValue = cast(JavaParams, pyValue)._to_java()

ValueError: Resolve param in estimatorParamMaps failed: LogisticRegression_87f4bc317e0b__regParam
This is the code that raised the error. The same code worked for a previous PySpark LogisticRegression model where I tuned the maxIter parameter.
# Save model, or upload if already saved
if not os.path.isdir(drive_path + 'lr_2_model'):
    lr_2_model.save(drive_path + 'lr_2_model')
else:
    lr_2_model = TrainValidationSplitModel.load(drive_path + 'lr_2_model')
This is the code where I defined lr_2_model (grid_search is a custom function I wrote; I don't think the error is in there, since it has been working with other models):
%%time
# Run grid search
if not os.path.isdir(drive_path + 'lr_2_model'):
    lr_2_model = grid_search(stages_with_classifier=lr_2_stages,
                             train_df=train_df_preprocessed,
                             model_grid=lr_2_grid,
                             parallelism=5)
And this is the code where I defined lr_2_grid, lr_2_stages, and lr_2.
lr_2 = LogisticRegression(
    featuresCol='scaled_features',
    labelCol='Anomalous',
    weightCol='Weight',
    standardization=False)
lr_2_stages = stages + [lr_2]

# Specify parameter grid
lr_2_grid = ParamGridBuilder()\
    .addGrid(lr_1.regParam, list(np.linspace(0.001, 0.1, 5)))\
    .build()
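
One check that might narrow this down: the string in the error has the form <parent uid>__<param name>, and the traceback suggests the save step looks for a stage whose uid matches the param's parent. Below is a minimal sketch of that comparison, assuming the grid is a list of dicts keyed by Param objects (as ParamGridBuilder.build() returns), that each Param exposes parent (a uid string) and name, and that each stage exposes uid. find_unresolved_params is a hypothetical helper name, not part of PySpark.

```python
def find_unresolved_params(param_maps, stages):
    """Return 'parentUid__name' for each grid param whose parent is not in stages.

    Hypothetical diagnostic helper (not a PySpark API). It relies only on
    each param having .parent (a uid string) and .name, and on each stage
    having .uid.
    """
    stage_uids = {stage.uid for stage in stages}
    unresolved = set()
    for param_map in param_maps:
        for param in param_map:
            # A param can only be matched to a stage if its parent uid is
            # the uid of one of the stages.
            if param.parent not in stage_uids:
                unresolved.add(f"{param.parent}__{param.name}")
    return sorted(unresolved)


# Intended use in the notebook above (needs the Spark objects, so left commented):
# print(find_unresolved_params(lr_2_grid, lr_2_stages))
# print(lr_2.uid)  # compare with the uid in the error message
```

An empty result would suggest every grid param belongs to one of the stages in lr_2_stages. Any entry names a param whose parent is some other estimator instance.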