Questions tagged [lightgbm]

LightGBM is a gradient boosting framework that uses tree-based learning algorithms. It is designed to be distributed and efficient with the following advantages: ... Support of parallel and GPU learning. Capable of handling large-scale data.

LightGBM is a high-performance gradient boosting (GBDT, GBRT, GBM or MART) framework based on decision tree algorithms, used for ranking, classification and many other machine learning tasks. It is under the umbrella of the DMTK (http://github.com/microsoft/dmtk) project of Microsoft.

676 questions
3
votes
1 answer

Suppressing Optuna's cv_agg binary_logloss output

If I tune a model with the LightGBMTunerCV, I always get this massive output of cv_agg's binary_logloss. With a bigger dataset, this (unnecessary) I/O slows down the performance of the optimization process. Here is the code: from…
3
votes
2 answers

Use of 'is_unbalance' parameter in Lightgbm

I am trying to use the 'is_unbalance' parameter in my model training for a binary classification problem where the positive class is approximately 3%. If I set the parameter 'is_unbalance', I observe that the binary log loss drops in the first…
jsanjayce
  • 272
  • 5
  • 15
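As a sketch of the weighting this question is about (assuming LightGBM's documented `scale_pos_weight` parameter, the manual counterpart to `is_unbalance=True`; the helper function below is hypothetical): with roughly 3% positives, the implied positive-class weight is the negative-to-positive ratio.

```python
# Sketch: computing a class-weight ratio for an imbalanced binary problem.
# LightGBM's is_unbalance=True weights the classes automatically; the
# equivalent manual knob is scale_pos_weight = n_negative / n_positive.

def scale_pos_weight(labels):
    """Return the negative-to-positive ratio for binary 0/1 labels."""
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    return n_neg / n_pos

# With ~3% positives, as in the question:
labels = [1] * 3 + [0] * 97
print(scale_pos_weight(labels))  # ≈ 32.33
```

Setting either option rescales the loss, so an immediate drop in raw binary log loss on the first iterations is expected rather than alarming.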
3
votes
0 answers

How to convert LGBM model from sklearn API to native API after training is completed?

I have trained an LGBM model using the sklearn API, like this: cb_classifier = LGBMClassifier(**params) cb_classifier.fit(X_train[features], y_train, eval_set = (X_validation[features], y_validation), …
jartymcfly
  • 1,945
  • 9
  • 30
  • 51
3
votes
0 answers

LightGBM usage of init_score results in no boosting

It seems if lightgbm.train is used with an initial score (init_score) it cannot boost this score. Here is a simple example: params = {"learning_rate": 0.1,"metric": "binary_logloss","objective": "binary", "boosting_type":…
user1488793
  • 284
  • 2
  • 14
3
votes
3 answers

Error in LightGBM algorithm using tidymodels and treesnip package

I want to try the LightGBM algorithm using the tidymodels and treesnip packages. Some preprocessing... # remotes::install_github("curso-r/treesnip") # install.packages("titanic") library(tidymodels) library(stringr) …
Edvardoss
  • 393
  • 3
  • 8
3
votes
0 answers

LightGBM benchmark shows no speedup on RTX-2080 GPU over CPU

The Higgs training runs for LightGBM take the same amount of time for me on both GPU and CPU - 26 seconds. Logs confirm that GPU run is using GPU (transferring data to GPU etc.) https://lightgbm.readthedocs.io/en/latest/GPU-Tutorial.html Went…
John Curry
  • 392
  • 3
  • 12
3
votes
1 answer

What does `free_raw_data` do in `lightgbm.Dataset()`?

I've read the docs and an explanation in the FAQ. But the former is just a tautology, and the latter explains things with self., as if I would regularly be using Dataset in my own classes. Usually, I load up a dataset and use it to train my models, so…
codeananda
  • 939
  • 1
  • 10
  • 16
3
votes
2 answers

Use 'predict_contrib' in LightGBM to get SHAP-values

In the LightGBM documentation it is stated that one can set predict_contrib=True to predict the SHAP-values. How do we extract the SHAP-values (apart from using the shap package)? I have tried model =…
CutePoison
  • 4,679
  • 5
  • 28
  • 63
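A sketch of how the returned values fit together (this reflects LightGBM's documented behavior for `pred_contrib=True`; the helper below is hypothetical): each row of the output has `n_features + 1` values, one SHAP contribution per feature plus the expected (base) value in the last position, and their sum is the model's raw prediction for that row.

```python
# Sketch: combining SHAP-style contributions returned per sample.
# Layout assumed: [c_1, ..., c_n, base_value]; the row sum is the
# raw (pre-sigmoid) prediction for that sample.

def raw_prediction(contribs):
    """contribs: per-feature contributions followed by the base value."""
    return sum(contribs)

row = [0.4, -0.1, 0.05, -1.2]  # 3 feature contributions + base value
print(raw_prediction(row))     # ≈ -0.85
```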
3
votes
1 answer

Training logs are not being printed in LightGBM in Jupyter

I am trying to train a simple LightGBM model on a MacBook, but it's not printing any logs even when the verbose parameter is set to 1 (or even greater than 1) param = {'num_leaves':50, 'num_trees':500, 'learning_rate':0.01, 'feature_fraction':1.0,…
silent_dev
  • 1,566
  • 3
  • 20
  • 45
3
votes
1 answer

LightGBM for feature selection

I'm working on a binary classification problem; my training data has millions of records and ~2000 variables. I'm running LightGBM for feature selection and using the features it selects to train a neural network (Keras) model for…
Haritha
  • 641
  • 2
  • 7
  • 12
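The selection step described above can be sketched in a few lines (the helper name and the importance values are hypothetical): rank features by their LightGBM importance and keep the top k before handing the reduced set to the Keras model.

```python
# Sketch: keep the top-k features by importance score.

def top_k_features(importances, k):
    """importances: {feature_name: importance}; returns k names, best first."""
    ranked = sorted(importances.items(), key=lambda kv: kv[1], reverse=True)
    return [name for name, _ in ranked[:k]]

imp = {"f1": 10.0, "f2": 250.0, "f3": 0.0, "f4": 42.0}
print(top_k_features(imp, 2))  # → ['f2', 'f4']
```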
3
votes
1 answer

negative 'Start training from score'

When running lgb.cv, I sometimes see negative numbers in the log following 'Start training from score'. I'm wondering what the number actually means, and in what unit. Is it in terms of the metric specified in params? Here's an excerpt: [LightGBM]…
marychin
  • 31
  • 1
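A sketch of where a negative value comes from (assuming the binary objective; for other objectives the initial score is derived differently): the logged score is the raw log-odds initial prediction computed from the label mean, so it is negative whenever the positive rate is below 0.5 — it is not the metric from params.

```python
import math

# For objective="binary", the "Start training from score" value is the
# log-odds of the base positive rate, i.e. the raw initial prediction.

def start_score(positive_rate):
    return math.log(positive_rate / (1 - positive_rate))

print(start_score(0.25))  # ≈ -1.0986, i.e. log(1/3)
```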
3
votes
1 answer

Writing create_tree_digraph plot to a png file in Python

I want to save the tree of my LightGBM model in .png format. I have tried two plotting methods from the lightgbm API - plot_tree and create_tree_digraph. import lightgbm as lgb from sklearn.datasets import load_iris X, y = load_iris(True) clf =…
Archana
  • 41
  • 2
  • 5
3
votes
0 answers

How does LightGBM compute feature importance when using 'gain'

I need to calculate feature importances for my LightGBM Booster model. However, I cannot understand how the feature importance values are obtained when using the 'gain' type. The docs say: If "gain", result contains total gains of splits which use…
Akim
  • 139
  • 6
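A sketch of what "total gains of splits" means (the aggregation below is a hypothetical illustration, not LightGBM's internal code): every split in every tree records the loss reduction it achieved, and the 'gain' importance of a feature is the sum of those reductions over all splits that used it.

```python
from collections import defaultdict

# Sketch: aggregate per-split gains into per-feature 'gain' importance.

def gain_importance(splits):
    """splits: iterable of (feature_name, split_gain) pairs
    collected across all trees in the model."""
    totals = defaultdict(float)
    for feature, gain in splits:
        totals[feature] += gain
    return dict(totals)

splits = [("age", 12.5), ("bmi", 3.0), ("age", 4.5)]
print(gain_importance(splits))  # → {'age': 17.0, 'bmi': 3.0}
```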
3
votes
2 answers

LightGBM plot_tree() Leaf numbers

What do the numbers on the LightGBM plot_tree method represent? As an example, I used the Pima Indians Diabetes dataset and then used the plot_tree method to yield the following: What do the numbers on the leaf nodes represent?
David293836
  • 1,165
  • 2
  • 18
  • 36
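A sketch of how the leaf numbers combine (assuming a binary objective; the leaf values below are made up): leaf values are raw scores, so a prediction sums one leaf value per tree and, for binary classification, passes the total through the sigmoid to get a probability.

```python
import math

def sigmoid(x):
    """Map a raw boosted score to a probability."""
    return 1.0 / (1.0 + math.exp(-x))

leaf_values = [0.2, -0.05, 0.5]  # hypothetical: one leaf value per tree
raw = sum(leaf_values)
print(sigmoid(raw))              # ≈ 0.657
```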
3
votes
0 answers

How to improve the performance of LightGBM Ranker?

I have some samples (~5000) with their features, and I want to rank them in terms of a score. I have already built a regression model that directly predicts the score, but I still want to try learning-to-rank methods, so I turned to the LightGBM…