I want to use some LightGBM functions properly.
This is the standard approach; it's no different from any other sklearn classifier:
- define X, y
- train_test_split
- create classifier
- fit on train
- predict on test
- compare predictions against the true test labels
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25)
# here maybe DecisionTreeClassifier(), RandomForestClassifier() etc.
model = lgb.LGBMClassifier()
model.fit(X_train, y_train)
predicted_y = model.predict(X_test)
print(metrics.classification_report(y_test, predicted_y))
But LightGBM also has its own native API, with functions and classes like lgb.Dataset, lgb.train, and Booster.
However, this Kaggle notebook doesn't call LGBMClassifier at all! Why?
What is the standard order to call LightGBM functions and train models the 'lgbm' way?
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25)
# why is this Dataset wrapper around X_train, y_train needed?
d_train = lgbm.Dataset(X_train, y_train)
# where is LGBMClassifier()?
bst = lgbm.train(params, d_train, 50, early_stopping_rounds=100)
preds = bst.predict(X_test)  # note: predict takes the feature matrix, not the labels
Why does lgbm.train() train immediately, with no separate model object and .fit() call?