I am trying to use BayesSearchCV, but I get an unexpected error. I don't pass the iid parameter anywhere, yet the error keeps saying __init__() got an unexpected keyword argument 'iid'. My code is below.

Code:

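from catboost import CatBoostClassifier
from sklearn.metrics import make_scorer, roc_auc_score
from sklearn.model_selection import StratifiedKFold
from skopt import BayesSearchCV
from skopt.space import Integer, Real

roc_auc = make_scorer(roc_auc_score, greater_is_better=True, needs_threshold=True)
skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=1234)

clf = CatBoostClassifier(thread_count=2,
                         loss_function='Logloss',
                         od_type='Iter',
                         verbose=False)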

# Defining your search space
search_spaces = {'iterations': Integer(10, 1000),
                 'depth': Integer(1, 8),
                 'learning_rate': Real(0.01, 1.0, 'log-uniform'),
                 'random_strength': Real(1e-9, 10, 'log-uniform'),
                 'bagging_temperature': Real(0.0, 1.0),
                 'border_count': Integer(1, 255),
                 'l2_leaf_reg': Integer(2, 30),
                 'scale_pos_weight': Real(0.01, 1.0, 'uniform')}

# Setting up BayesSearchCV
opt = BayesSearchCV(clf,
                    search_spaces,
                    scoring=roc_auc,
                    cv=skf,
                    n_iter=100,
                    n_jobs=1,  # use just 1 job with CatBoost in order to avoid segmentation fault
                    return_train_score=False,
                    refit=True,
                    optimizer_kwargs={'base_estimator': 'GP'}
)

Error message:

TypeError: __init__() got an unexpected keyword argument 'iid'

ForceBru
Ali Yusifov
  • What versions of `scikit-optimize` and `scikit-learn` do you have installed? `iid` doesn't appear in your code, so the "bug" is in `skopt/searchcv.py`. – chepner Jul 11 '21 at 13:38
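The mechanism behind this comment can be reproduced without either library. The sketch below uses two hypothetical stand-in classes (`BaseSearch` and `ForwardingSearch` are illustrative names, not the real scikit-learn or scikit-optimize classes): the caller never mentions `iid`, but the subclass forwards it to a base class that no longer accepts it.

```python
class BaseSearch:
    """Hypothetical stand-in for a base class that has dropped `iid`."""

    def __init__(self, estimator, scoring=None):
        self.estimator = estimator
        self.scoring = scoring


class ForwardingSearch(BaseSearch):
    """Hypothetical stand-in for a subclass that still forwards `iid`."""

    def __init__(self, estimator, scoring=None, iid=True):
        # `iid` has a default value, so the caller does not need to pass it,
        # but it is forwarded to the base class anyway.
        super().__init__(estimator=estimator, scoring=scoring, iid=iid)


def try_build(cls):
    """Construct `cls` without passing `iid`; return 'ok' or the error text."""
    try:
        cls(estimator="dummy")
        return "ok"
    except TypeError as exc:
        return str(exc)


if __name__ == "__main__":
    print(try_build(BaseSearch))
    print(try_build(ForwardingSearch))
```

The second call fails inside the subclass constructor, which is why the traceback points at library code rather than at the calling script.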

1 Answer

If you are using skopt version 0.8.1 with scikit-learn >= 0.24, there are some incompatibilities between the two libraries.
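To see whether an environment matches that combination, the installed versions can be read with the standard library. This is a minimal sketch: `installed_version`, `major_minor`, and `affected` are hypothetical helper names, and treating every 0.8.x release of skopt as affected is an assumption (only 0.8.1 is named above).

```python
import re
from importlib import metadata


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def major_minor(version):
    """Parse the leading 'X.Y' of a version string into a tuple of ints."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def affected(skopt_version, sklearn_version):
    """True for skopt 0.8.x (assumption) combined with scikit-learn >= 0.24."""
    return (major_minor(skopt_version) == (0, 8)
            and major_minor(sklearn_version) >= (0, 24))


if __name__ == "__main__":
    skopt_v = installed_version("scikit-optimize")
    sklearn_v = installed_version("scikit-learn")
    print("scikit-optimize:", skopt_v, "| scikit-learn:", sklearn_v)
    if skopt_v and sklearn_v:
        print("affected combination:", affected(skopt_v, sklearn_v))
```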

  1. sklearn no longer accepts the iid parameter. A possible fix is to edit searchcv.py so that BayesSearchCV stores iid as its own member variable and no longer passes it to the base class constructor, e.g.
self.iid = iid
    
super(BayesSearchCV, self).__init__(
    estimator=estimator, scoring=scoring, n_jobs=n_jobs, refit=refit, cv=cv, verbose=verbose,
    pre_dispatch=pre_dispatch, error_score=error_score,
    return_train_score=return_train_score)
  2. BayesSearchCV._fit expects the output of Parallel to unzip into five sequences, but that output is now a list of dicts. Replace
(test_scores, test_sample_counts, fit_time, score_time, parameters) = zip(*out)

with

test_scores = [d['test_scores'] for d in out]
test_sample_counts = [d['n_test_samples'] for d in out]
fit_time = [d['fit_time'] for d in out]
score_time = [d['score_time'] for d in out]
parameters = [d['parameters'] for d in out]
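The difference between the two result shapes can be seen on toy data. In this sketch the numbers are made up and `old_out` / `new_out` are hypothetical stand-ins for what Parallel returns before and after the change; only the dict keys are taken from the replacement lines above.

```python
# Old shape: one tuple per fit, so zip(*out) transposes it into five sequences.
old_out = [
    (0.81, 20, 0.12, 0.01, {"depth": 3}),
    (0.78, 20, 0.15, 0.01, {"depth": 5}),
]
(test_scores, test_sample_counts, fit_time, score_time, parameters) = zip(*old_out)

# New shape: one dict per fit, carrying the same information under named keys.
new_out = [
    {"test_scores": 0.81, "n_test_samples": 20, "fit_time": 0.12,
     "score_time": 0.01, "parameters": {"depth": 3}},
    {"test_scores": 0.78, "n_test_samples": 20, "fit_time": 0.15,
     "score_time": 0.01, "parameters": {"depth": 5}},
]

# zip(*new_out) iterates over dict keys, so it yields key names, not values.
first_column_via_zip = list(zip(*new_out))[0]

# Indexing each dict by key recovers the same columns as the old unpacking.
new_test_scores = [d["test_scores"] for d in new_out]
new_test_sample_counts = [d["n_test_samples"] for d in new_out]
new_parameters = [d["parameters"] for d in new_out]

if __name__ == "__main__":
    print(first_column_via_zip)
    print(new_test_scores, new_test_sample_counts, new_parameters)
```

Note that the list comprehensions produce lists while the old unpacking produced tuples, so any later code that depends on the container type may need a matching adjustment.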

Other options include upgrading to a newer version of skopt (e.g. 0.9 resolves some of the compatibility issues) or downgrading to an older version of sklearn. Discussion of this can be seen here.

First Last