
I'm trying to run an Optuna study whose objective function contains a Pipeline. I sort of understand the error, but I have no idea what the solution is.

This is the code I'm trying to run. It works fine when I run XGBClassifier directly on preprocessed data that doesn't need to go through a pipeline.

def objective(trial):
    """Define the objective function"""

    params = {
        'max_depth': trial.suggest_int('max_depth', 1, 9),
        'learning_rate': trial.suggest_loguniform('learning_rate', 0.01, 1.0),
        'n_estimators': trial.suggest_int('n_estimators', 50, 500),
        'min_child_weight': trial.suggest_int('min_child_weight', 1, 10),
        'gamma': trial.suggest_loguniform('gamma', 1e-8, 1.0),
        'subsample': trial.suggest_loguniform('subsample', 0.01, 1.0),
        'colsample_bytree': trial.suggest_loguniform('colsample_bytree', 0.01, 1.0),
        'reg_alpha': trial.suggest_loguniform('reg_alpha', 1e-8, 1.0),
        'reg_lambda': trial.suggest_loguniform('reg_lambda', 1e-8, 1.0),
        'eval_metric': 'mlogloss',
        'use_label_encoder': False
    }
    
    # Define model
    xgbmodel = XGBClassifier(random_state = 1)

    # Bundle preprocessing and modeling code in a pipeline
    xgb_pipeline = Pipeline(steps=[('preprocessor', preprocessor),
                            # ('skb', SelectKBest(chi2, k = 10)),
                           ('xgbmodel', xgbmodel)
                           ])

    # Fit the random search model
    start_time = timer(None)  # timing starts from this point for the "start_time" variable

    # Fit the model
    optuna_model = xgb_pipeline(**params)
    optuna_model.fit(X_train, y_train)

    # Make predictions
    y_pred = optuna_model.predict(X_valid)

    # Evaluate predictions
    accuracy = accuracy_score(y_valid, y_pred)
    return accuracy
study = optuna.create_study(direction='maximize') 
study.optimize(objective, n_trials=100)

I get an error that starts:

[W 2023-01-11 19:30:05,914] Trial 2 failed because of the following error: TypeError("'Pipeline' object is not callable")
Steve Rowe

1 Answer

So, I got it to work. Below I explain my understanding of what each change does.

def objective(trial):
    """Define the objective function"""

    params = {
        'xgbmodel__max_depth': trial.suggest_int('max_depth', 1, 9),
        'xgbmodel__learning_rate': trial.suggest_loguniform('learning_rate', 0.01, 1.0),
        'xgbmodel__n_estimators': trial.suggest_int('n_estimators', 50, 500),
        'xgbmodel__min_child_weight': trial.suggest_int('min_child_weight', 1, 10),
        'xgbmodel__gamma': trial.suggest_loguniform('gamma', 1e-8, 1.0),
        'xgbmodel__subsample': trial.suggest_loguniform('subsample', 0.01, 1.0),
        'xgbmodel__colsample_bytree': trial.suggest_loguniform('colsample_bytree', 0.01, 1.0),
        'xgbmodel__reg_alpha': trial.suggest_loguniform('reg_alpha', 1e-8, 1.0),
        'xgbmodel__reg_lambda': trial.suggest_loguniform('reg_lambda', 1e-8, 1.0),
        'xgbmodel__eval_metric': 'mlogloss',
        'xgbmodel__use_label_encoder': False
    }

    # Define model
    xgbmodel = XGBClassifier()

    # Bundle preprocessing and modeling code in a pipeline
    xgb_pipeline = Pipeline(steps=[
        ('preprocessor', preprocessor),
        # ('skb', SelectKBest(chi2, k = 10)),
        ('xgbmodel', xgbmodel)
    ])

    # Fit the random search model

    # Fit the model
    optuna_model = xgb_pipeline.set_params(**params)
    optuna_model.fit(X_train, y_train)

    # Make predictions
    y_pred = optuna_model.predict(X_valid)

    # Evaluate predictions
    accuracy = accuracy_score(y_valid, y_pred)
    return accuracy

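In params, each key needs the prefix xgbmodel__ (the step name followed by two underscores). This tells the pipeline which step the parameter belongs to, in this case the second step, 'xgbmodel'. Only the dictionary keys need the prefix; the names passed to trial.suggest_int and trial.suggest_loguniform can stay as they were.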
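The prefix rule can be checked on a small pipeline. This is a minimal sketch, not the code from the question: it assumes only scikit-learn is installed, and it uses StandardScaler and LogisticRegression (with a step named "model") as stand-ins for the preprocessor and XGBClassifier.

```python
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Stand-in pipeline: the step names here are illustrative, not from the question.
pipe = Pipeline(steps=[
    ("preprocessor", StandardScaler()),
    ("model", LogisticRegression()),
])

# "<step name>__<parameter>" routes each value to that step's estimator.
pipe.set_params(model__C=0.5, model__max_iter=200)

model_params = pipe.named_steps["model"].get_params()
print(model_params["C"], model_params["max_iter"])

# Without the prefix, the key is looked up on the Pipeline itself and rejected.
try:
    pipe.set_params(C=0.5)
    unprefixed_ok = True
except ValueError:
    unprefixed_ok = False
print(unprefixed_ok)
```

Calling pipe.get_params().keys() lists every valid key, which is a quick way to confirm the exact prefixed names for a given pipeline.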

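Then, before fitting the model on the training data, you set the parameters with the set_params(**params) method. Writing set_params on its own gives an error: it is a method, so it has to be called on the object it applies to, in this case the pipeline, as in xgb_pipeline.set_params(**params). The original xgb_pipeline(**params) failed because a Pipeline object is not a function and cannot be called. Hopefully I have used the right terminology.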
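The difference between calling the pipeline and calling its set_params method can be seen in isolation. Again a sketch under the same assumptions: scikit-learn only, with LogisticRegression standing in for XGBClassifier and an illustrative step name.

```python
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

pipe = Pipeline(steps=[("model", LogisticRegression())])

# A Pipeline instance defines no __call__, so it cannot be used like a function.
is_callable = callable(pipe)

try:
    pipe(model__C=2.0)  # same mistake as xgb_pipeline(**params)
    call_error = None
except TypeError as exc:
    call_error = str(exc)

# set_params is a method on the instance: it updates the pipeline in place
# and returns that same object, so the result can be fitted directly.
configured = pipe.set_params(model__C=2.0)
same_object = configured is pipe

print(is_callable, same_object)
print(call_error)
```

Because set_params returns the pipeline itself, optuna_model and xgb_pipeline in the answer refer to the same object; the assignment is a convenience rather than a copy.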

Steve Rowe