I have a dataset with the following dimensions for training and testing sets:

X_train = (58149, 9)
y_train = (58149,)
X_test = (24921, 9) 
y_test = (24921,)

The code that I have for RandomizedSearchCV using LightGBM classifier is as follows:

# Parameters to be used for RandomizedSearchCV-
rs_params = {
        # 'bagging_fraction': [0.6, 0.66, 0.7],
        'bagging_fraction': sp_uniform(0.5, 0.8),
        'bagging_frequency': sp_randint(5, 8),
        # 'feature_fraction': [0.6, 0.66, 0.7],
        'feature_fraction': sp_uniform(0.5, 0.8),
        'max_depth': sp_randint(10, 13),
        'min_data_in_leaf': sp_randint(90, 120),
        'num_leaves': sp_randint(1200, 1550)

}

# Initialize a RandomizedSearchCV object using 5-fold CV-
rs_cv = RandomizedSearchCV(estimator=lgb.LGBMClassifier(), param_distributions=rs_params, cv = 5, n_iter=100)

# Train on training data-
rs_cv.fit(X_train, y_train)

When I execute this code, it gives me the following error:

LightGBMError: Check failed: bagging_fraction <=1.0 at /__w/1/s/python-package/compile/src/io/config_auto.cpp, line 295.

Any idea as to what's going wrong?

Flavia Giammarino
Arun

1 Answer

I removed sp_uniform and sp_randint from your code, and it now runs without the error:

import numpy as np
import lightgbm as lgb
from sklearn.model_selection import RandomizedSearchCV

# Fix the seed so the synthetic data is reproducible-
np.random.seed(0)

d1 = np.random.randint(2, size=(100, 9))
d2 = np.random.randint(3, size=(100, 9))
d3 = np.random.randint(4, size=(100, 9))

Y = np.random.randint(7, size=(100,))
X = np.column_stack([d1, d2, d3])

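# Each tuple is a list of candidate values to choose from, not a range-
rs_params = {
        'bagging_fraction': (0.5, 0.8),
        # LightGBM's name for this parameter is bagging_freq-
        'bagging_freq': (5, 8),
        'feature_fraction': (0.5, 0.8),
        'max_depth': (10, 13),
        'min_data_in_leaf': (90, 120),
        'num_leaves': (1200, 1550)
}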

# Initialize a RandomizedSearchCV object using 5-fold CV-
rs_cv = RandomizedSearchCV(estimator=lgb.LGBMClassifier(), param_distributions=rs_params, cv = 5, n_iter=100,verbose=1)

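# Train on training data (verbose is already set on RandomizedSearchCV above;
# LightGBM 4.x no longer accepts a verbose argument in fit)-
rs_cv.fit(X, Y)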

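According to the LightGBM documentation, bagging_fraction must satisfy 0.0 < bagging_fraction <= 1.0. Assuming sp_uniform is scipy.stats.uniform, its two arguments are loc and scale rather than lower and upper bounds, so sp_uniform(0.5, 0.8) samples from [0.5, 1.3] and can return values above 1.0, which triggers the error. sp_uniform(0.5, 0.3) covers the intended range [0.5, 0.8].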
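
The sketch below checks which values the question's sp_uniform(0.5, 0.8) can produce. It assumes sp_uniform is the common alias for scipy.stats.uniform (the question does not show its imports); the sample size and seed are arbitrary.

```python
from scipy.stats import uniform

# The two positional arguments are loc and scale, so this is loc=0.5, scale=0.8.
dist = uniform(0.5, 0.8)

# Lower and upper ends of the distribution: loc and loc + scale.
lower, upper = dist.ppf(0.0), dist.ppf(1.0)

# Draw samples and measure how many would be rejected by LightGBM's
# bagging_fraction <= 1.0 check.
samples = dist.rvs(size=10_000, random_state=0)
share_invalid = (samples > 1.0).mean()

print(f"range: [{lower:.1f}, {upper:.1f}]")
print(f"share of draws above 1.0: {share_invalid:.2f}")
```

Because a sizeable share of draws exceeds 1.0, a search with n_iter=100 is very likely to hit an invalid bagging_fraction (or feature_fraction) at some point.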

Pass verbose=1 to RandomizedSearchCV so that it reports its progress while fitting the candidate models.
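
If you want to keep sampling from continuous ranges rather than from a few fixed values, one option is to keep the scipy distributions and correct their arguments. This is a minimal sketch, not a tested tuning recipe: it assumes sp_uniform and sp_randint are scipy.stats.uniform and scipy.stats.randint, uses LightGBM's documented name bagging_freq, and uses sklearn's ParameterSampler only to inspect what a search would draw.

```python
from scipy.stats import randint, uniform
from sklearn.model_selection import ParameterSampler

# uniform(loc, scale) samples from [loc, loc + scale];
# randint(low, high) samples integers from low up to high - 1.
rs_params = {
    "bagging_fraction": uniform(0.5, 0.3),   # [0.5, 0.8]
    "bagging_freq": randint(5, 8),           # 5, 6 or 7
    "feature_fraction": uniform(0.5, 0.3),   # [0.5, 0.8]
    "max_depth": randint(10, 13),
    "min_data_in_leaf": randint(90, 120),
    "num_leaves": randint(1200, 1550),
}

# Draw the same number of candidates the search would try.
samples = list(ParameterSampler(rs_params, n_iter=100, random_state=0))
max_bagging = max(s["bagging_fraction"] for s in samples)
max_feature = max(s["feature_fraction"] for s in samples)
print(f"largest bagging_fraction drawn: {max_bagging:.3f}")
print(f"largest feature_fraction drawn: {max_feature:.3f}")

# By contrast, a tuple is a list of discrete choices: only 0.5 and 0.8 are tried.
tuple_space = {"bagging_fraction": (0.5, 0.8)}
tuple_values = sorted(
    {s["bagging_fraction"] for s in ParameterSampler(tuple_space, n_iter=2, random_state=0)}
)
print(f"values drawn from the tuple: {tuple_values}")
```

The corrected rs_params dictionary can be passed as param_distributions to RandomizedSearchCV exactly as in the question.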

Flavia Giammarino
Sohaib Anwaar