
I've been trying this for a while but still can't figure out a solution.

I have a pipeline with a few steps and an LGBM classifier, which I want to use with the `early_stopping_round` parameter. However, I keep getting errors related to this parameter. This was my last attempt:

model = lgb.LGBMClassifier(
    class_weight={0: class_weights[0], 1: class_weights[1]},
    early_stopping_round=50,
    eval_metric="logloss",
    learning_rate=0.1,
)

pipe = Pipeline(
    [
        ("vect", CountVectorizer()),
        ("feature_sel", SelectKBest(chi2, k=200)),
        ("model", model),
    ]
)


pipe.fit(
    X_train,
    y_train,
    model__eval_set=[X_test, y_test],
)

y_pred = model.predict(X_test)

print(f"Accuracy: {accuracy_score(y_test, y_pred):.3f}")
print(f"F1-score: {f1_score(y_test, y_pred):.3f}")
print(f"Recall-score: {recall_score(y_test, y_pred):.3f}")
print(f"Precision-score: {precision_score(y_test, y_pred):.3f}")
print(f"ROC AUC: {roc_auc_score(y_test, y_pred):.3f}")
print(f"Logloss: {log_loss(y_test, y_pred):.3f}")

I got this error:

ValueError: too many values to unpack (expected 2)
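For context on where the error likely comes from (an assumption, not verified against your exact versions): LightGBM's sklearn `fit` iterates over `eval_set` expecting a list of `(X, y)` pairs, so a flat list `[X_test, y_test]` makes it try to unpack a single 2-D array into two values. A minimal, LightGBM-free reproduction of the unpacking failure:

```python
import numpy as np

X_test = np.zeros((10, 3))
y_test = np.zeros(10)

def unpack(eval_set):
    # mimics the "for X, y in eval_set" loop inside LightGBM's fit
    return [(X.shape, y.shape) for X, y in eval_set]

print(unpack([(X_test, y_test)]))  # one (X, y) pair unpacks cleanly
try:
    unpack([X_test, y_test])  # each bare array gets unpacked row-by-row instead
except ValueError as err:
    print(err)  # too many values to unpack (expected 2)
```

So the fit-parameter form would need to be `model__eval_set=[(X_test, y_test)]`, with the pair wrapped in a tuple.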

I already tried some insights from these discussions: this, this, this and this. Nothing seems to work for my case. Any ideas?

It's a pretty common dataset; I've been testing this pipeline on the spam dataset from Kaggle.
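One more hedged sketch of the pattern that should be needed even after the tuple fix (an assumption, untested with LightGBM here): the LGBM step never sees raw text, so the validation data has to go through the same fitted preprocessing as the training data before being passed as `eval_set`. Shown with only the preprocessing half of the pipeline (hypothetical toy data) so it runs without LightGBM installed:

```python
from sklearn.pipeline import Pipeline
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_selection import SelectKBest, chi2

# Hypothetical toy stand-ins for the spam data
X_train = ["free prize now", "meeting at noon", "win cash fast", "lunch tomorrow"]
y_train = [1, 0, 1, 0]
X_test = ["free cash prize", "see you at lunch"]
y_test = [1, 0]

# Preprocessing-only pipeline: the steps that come before the model
prep = Pipeline([
    ("vect", CountVectorizer()),
    ("feature_sel", SelectKBest(chi2, k=5)),
])

X_train_t = prep.fit_transform(X_train, y_train)
X_test_t = prep.transform(X_test)  # eval data must pass through the same transforms

# With LightGBM installed, the fit would then look like (assumption, untested):
#   model.fit(X_train_t, y_train, eval_set=[(X_test_t, y_test)])
# or, keeping the full pipeline and passing the transformed eval set through:
#   pipe.fit(X_train, y_train, model__eval_set=[(X_test_t, y_test)])
print(X_train_t.shape, X_test_t.shape)
```

Note the trade-off: passing `model__eval_set` to `pipe.fit` requires the transformers to already be fitted (here, `prep` is fitted on the same training data first), otherwise the eval set cannot be transformed consistently.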

dsbr__0