Questions tagged [lightgbm]

LightGBM is a gradient boosting framework that uses tree-based learning algorithms. It is designed to be distributed and efficient, with the following advantages: ... support for parallel and GPU learning; capable of handling large-scale data.

LightGBM is a high-performance gradient boosting (GBDT, GBRT, GBM, or MART) framework based on decision tree algorithms, used for ranking, classification, and many other machine learning tasks. It is under the umbrella of Microsoft's DMTK project (http://github.com/microsoft/dmtk).

676 questions
5 votes, 1 answer

LightGBM 'class_weight' parameter: To use with Binary Classification or not?

When dealing with class imbalance, penalizing the majority class is a common practice I have come across while building machine learning models. Hence, I often use class weights after re-sampling. LightGBM is one efficient decision tree…
5 votes, 0 answers

What is LightGBM's support for missing labels

We have a dataset where some of the labels are missing. This only came to our attention recently, and we have removed those rows. It got me thinking: how did this ever work? It doesn't seem to make sense to give a GBM an example without a…
Jon • 3,985 • 7 • 48 • 80
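As a minimal illustration (hypothetical frame), dropping rows whose label is missing before building the training set:

```python
import numpy as np
import pandas as pd

# Supervised GBMs need a label for every row; a row with a missing label
# contributes nothing to the gradient, so the usual fix is to drop
# (or, where it makes sense, impute) such rows before training.
df = pd.DataFrame({"feature": [1.0, 2.0, 3.0, 4.0],
                   "label":   [0.0, np.nan, 1.0, np.nan]})
clean = df.dropna(subset=["label"])
```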
5 votes, 1 answer

LightGBM early stopping does not work for a custom metric

I have used a custom metric with LightGBM, but early stopping works on log loss, which is the objective function. How can I fix that, or change early stopping to work on the eval metric? def evaluate_macroF1_lgb(truth, predictions): pred_labels =…
Anubhav Natani • 324 • 3 • 11
5 votes, 1 answer

LightGBM: scikit-learn API vs. native training/Dataset API, and lgb.cv vs. GridSearchCV/RandomizedSearchCV

What are the differences between LightGBM's scikit-learn API (LGBMModel, LGBMClassifier, etc.) and its default API (lgb.Dataset, lgb.cv, lgb.train)? Which one should I prefer? Is it better to use lgb.cv or sklearn's GridSearchCV/RandomizedSearchCV…
Sift • 633 • 1 • 10 • 18
5 votes, 1 answer

Error: "Could NOT find OpenMP_C" when installing LightGBM on macOS

While installing LightGBM on macOS, I got the following error: CMake Error at /usr/local/Cellar/cmake/3.12.4/share/cmake/Modules/FindPackageHandleStandardArgs.cmake:137 (message): Could NOT find OpenMP_C (missing: OpenMP_C_FLAGS…
big bai • 61 • 1 • 4
5 votes, 2 answers

LightGBM: how to use eval_sample_weight

I am working on a binary classification problem in LightGBM (scikit-learn API) and have trouble understanding how to include sample weights. My code currently looks like this: classifier = LGBMClassifier(n_estimators=100, learning_rate = 0.1,…
Petter T • 3,387 • 2 • 19 • 31
5 votes, 1 answer

LightGBM: Sklearn and Native API equivalence

I'm experimenting with LightGBM through the training API (http://lightgbm.readthedocs.io/en/latest/Python-API.html#training-api) and the scikit-learn API (http://lightgbm.readthedocs.io/en/latest/Python-API.html#scikit-learn-api). I've not been able to make a clear…
dokteurwho • 321 • 2 • 6
5 votes, 4 answers

Install LightGBM on Windows

I used pip to install lightgbm on Windows. Should that work? I ask because I get an error while importing LGBMRegressor: "cannot import name 'LGBMRegressor'".
Zhixin Zhang • 71 • 1 • 3 • 7
5 votes, 3 answers

How are leaf scores calculated in these XGBoost trees?

I am looking at the image below. Can someone explain how the leaf scores are calculated? I thought it was -1 for a No and +1 for a Yes, but then I can't figure out how the little girl gets .1. And that doesn't work for tree 2 either.
done_merson • 2,800 • 2 • 22 • 30
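For reference, the standard XGBoost formula for a leaf's optimal score (which the trees in question, apparently the boy/girl example from the XGBoost introduction, would follow) is, with $g_i$ and $h_i$ the first and second derivatives of the loss for the instances $I_j$ falling in leaf $j$, and $\lambda$ the L2 regularization term:

```latex
w_j^{*} = -\frac{\sum_{i \in I_j} g_i}{\sum_{i \in I_j} h_i + \lambda}
```

The printed leaf values are typically also scaled by the learning rate, which may be why small values such as .1 appear rather than ±1.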
4 votes, 1 answer

Fitting a linear function at the leaves of a CatBoost model

Is there an equivalent to the linear_tree function in LightGBM in the CatBoost library? I would like to use a linear function at the leaves instead of a constant.
user308827 • 21,227 • 87 • 254 • 417
4 votes, 1 answer

SHAP not working with LightGBM categorical features

My model uses LGBMClassifier. I'd like to use SHAP (Shapley values) to interpret features. However, SHAP gives me errors on categorical features. For example, I have a feature "Smoker" whose values include "Yes" and "No". I got an error from…
Fred Chang • 47 • 1 • 6
4 votes, 1 answer

BayesianOptimization fails due to float error

I want to tune the hyperparameters of my LightGBM model and used Bayesian optimization to do so. Sadly, the algorithm fails to converge. MRE: import warnings import pandas as pd import time import numpy as np warnings.filterwarnings("ignore") import…
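A frequent cause of such float errors is that Bayesian optimizers propose continuous values for integer-only LightGBM parameters. A hedged sketch of the usual fix (`lgb_objective` is a hypothetical name for the function handed to the optimizer):

```python
# Cast continuous proposals to int *inside* the objective before training,
# since parameters such as num_leaves and max_depth must be integers.
def lgb_objective(num_leaves, max_depth, learning_rate):
    params = {
        "num_leaves": int(round(num_leaves)),
        "max_depth": int(round(max_depth)),
        "learning_rate": learning_rate,
    }
    # In the real objective you would train here and return a CV score;
    # returning the params keeps this sketch self-contained.
    return params

params = lgb_objective(31.7, 6.2, 0.05)
```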
4 votes, 1 answer

Provide Additional Custom Metric to LightGBM for Early Stopping

I'm running a binary classification in LightGBM using the training API and want to stop on a custom metric while still tracking one or more built-in metrics. It's not clear whether this is possible, though. Here we can disable the default binary_logloss…
Kyle Parsons • 1,475 • 6 • 14
4 votes, 1 answer

How to understand Shapley value for binary classification problem?

I am very new to the shap Python package, and I am wondering how I should interpret Shapley values for a binary classification problem. Here is what I have done so far. First, I used a LightGBM model to fit my data, something like: import shap import…
Xudong • 441 • 5 • 16
4 votes, 2 answers

lightgbm.basic.LightGBMError: Check failed: (best_split_info.left_count) > (0)

There are a couple of other questions similar to this one, but I couldn't find a solution that seems to fit. I am using LightGBM with scikit-optimize's BayesSearchCV. full_pipeline = skl.Pipeline(steps=[('preprocessor', pre_processor), …
Lucy • 179 • 1 • 4 • 14