
I am trying to save a model in Azure Databricks, but I am getting this error: "It appears that you are attempting to reference SparkContext from a broadcast variable, action, or transformation. SparkContext can only be used on the driver, not in code that it run on workers. For more information, see SPARK-5063."

Following is the code I am using:

import dill
with open('model.sav', "wb") as f:
    dill.dump(model,f)
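
For reference, here is a minimal sketch of the same dump written to a DBFS path instead of the driver's working directory (the path /dbfs/tmp/model.sav is only an example). This assumes the code runs on the driver and that model itself holds no reference to SparkContext, which is what the SPARK-5063 error complains about:

import dill

# /dbfs/... is the local FUSE mount of DBFS on the cluster, so the file is
# persisted to DBFS rather than only to the driver's local disk
with open('/dbfs/tmp/model.sav', 'wb') as f:
    dill.dump(model, f)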

Or is there any other way to download the model to my local machine without using MLeap or MLflow?
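
If the file is written under /dbfs as in the sketch above, it could then be copied to a local machine with the Databricks CLI, for example: databricks fs cp dbfs:/tmp/model.sav ./model.sav (this assumes the CLI is installed and configured against the workspace). I would still prefer a cleaner way if one exists.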

I am using this code to train the model.

import xgboost as xgb
from sklearn.pipeline import Pipeline, make_pipeline
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import GridSearchCV

# Preprocessing: median imputation followed by standard scaling
pipe = Pipeline(steps=[
    ('imputer', SimpleImputer(strategy='median')),
    ('scaler', StandardScaler())])

# DataFrameScaler is a custom transformer (not shown here)
pipe_xgb = make_pipeline(pipe, DataFrameScaler(), xgb.XGBClassifier(objective='binary:logistic'))
param_grid_classifier = {
    'xgbclassifier__n_estimators': [100, 200],
    'xgbclassifier__learning_rate': [0.1, 0.01],
    'xgbclassifier__colsample_bytree': [0.3, 0.5],
    'xgbclassifier__max_depth': [4],
    'xgbclassifier__reg_lambda': [0.01, 0.1],
    'xgbclassifier__n_jobs': [4]
}
metric = 'average_precision'
grid_search1 = GridSearchCV(pipe_xgb, param_grid=param_grid_classifier, scoring=metric, cv=2)

gridmodel1 = grid_search1.fit(X_train, y_train_label)
model = gridmodel1.best_estimator_

Thanks in advance for any help.

Surbhi Jain
