Questions tagged [boosting]


From the docs:

"Boosting" is a machine learning ensemble meta-algorithm, primarily for reducing bias (and also variance) in supervised learning, and a family of machine learning algorithms that convert weak learners into strong ones.

Also:

From the docs:

Boosting is the process of enhancing the relevancy of a document or field. Field-level mapping allows you to define an explicit boost level on a specific field. The boost field mapping (applied on the root object) allows you to define a boost field whose content will control the boost level of the document.
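In the machine-learning sense, the weak-to-strong idea can be sketched with scikit-learn's AdaBoostClassifier; the dataset and parameter values below are illustrative, and the estimator argument is named base_estimator in scikit-learn versions before 1.2:

```python
# A minimal sketch of boosting: many depth-1 "weak" trees are trained
# sequentially, each focusing on the examples its predecessors got wrong.
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1_000, random_state=0)

weak = DecisionTreeClassifier(max_depth=1)               # a single decision stump
strong = AdaBoostClassifier(estimator=weak, n_estimators=100).fit(X, y)

print(weak.fit(X, y).score(X, y))   # one stump alone
print(strong.score(X, y))           # the boosted ensemble
```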

181 questions
5
votes
1 answer

Generate code for sklearn's GradientBoostingClassifier

I want to generate code (Python for now, but ultimately C) from a trained gradient boosted classifier (from sklearn). As far as I understand it, the model takes an initial predictor, and then adds predictions from sequentially trained regression…
Pokey McPokerson
  • 752
  • 6
  • 17
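A hedged sketch of the idea, assuming a fitted binary GradientBoostingClassifier: each tree in model.estimators_ exposes its split arrays through tree_, which can be walked recursively to emit source. tree_to_code is an illustrative helper, not part of sklearn's API.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

X, y = make_classification(n_samples=200, random_state=0)
model = GradientBoostingClassifier(n_estimators=5, random_state=0).fit(X, y)

def tree_to_code(tree, node=0, depth=1):
    """Recursively turn one regression tree into nested if/else source."""
    pad = "    " * depth
    if tree.children_left[node] == -1:          # -1 marks a leaf
        return f"{pad}return {float(tree.value[node][0][0])}\n"
    f, t = tree.feature[node], float(tree.threshold[node])
    return (f"{pad}if x[{f}] <= {t}:\n"
            + tree_to_code(tree, tree.children_left[node], depth + 1)
            + f"{pad}else:\n"
            + tree_to_code(tree, tree.children_right[node], depth + 1))

# For binary classification, estimators_ has shape (n_estimators, 1); the
# raw score is the initial prediction plus learning_rate * sum of trees.
print("def tree_0(x):\n" + tree_to_code(model.estimators_[0, 0].tree_))
```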
4
votes
1 answer

Handling unbalanced data in GradientBoostingClassifier using class weights?

I have a very unbalanced dataset on which I need to build a model for a classification problem. The dataset has around 30000 samples, of which around 1000 are labelled as 1 and the rest as 0. I build the model by the following…
Spedo
  • 355
  • 3
  • 13
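A minimal sketch of one common approach, since GradientBoostingClassifier has no class_weight parameter: pass per-sample weights to fit. The 30000/1000 split mirrors the question; all other values are illustrative.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.utils.class_weight import compute_sample_weight

rng = np.random.default_rng(0)
X = rng.normal(size=(30_000, 10))
y = np.r_[np.ones(1_000, dtype=int), np.zeros(29_000, dtype=int)]

# 'balanced' gives each class a total weight proportional to 1 / frequency.
weights = compute_sample_weight(class_weight="balanced", y=y)

clf = GradientBoostingClassifier(random_state=0)
clf.fit(X, y, sample_weight=weights)
```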
3
votes
1 answer

How to use "is_unbalance" and "scale_pos_weight" parameters in LightGBM for a binary classification project that is unbalanced (80:20)

I currently have an imbalanced dataset, as shown in the diagram below. Then I use the 'is_unbalance' parameter by setting it to True when training the LightGBM model. The diagrams below show how I use this parameter. Example of using the native API: Example…
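A hedged sketch of the two alternatives; as far as I know they should not be combined, and for an 80:20 split scale_pos_weight is simply the negative-to-positive ratio. The data below is synthetic.

```python
import numpy as np
import lightgbm as lgb

rng = np.random.default_rng(0)
X = rng.normal(size=(10_000, 8))
y = (rng.random(10_000) < 0.2).astype(int)   # ~20% positives
train_set = lgb.Dataset(X, label=y)

# Option 1: let LightGBM derive the class weight itself.
params_a = {"objective": "binary", "is_unbalance": True}

# Option 2: set the positive-class weight explicitly (n_neg / n_pos ≈ 4).
params_b = {"objective": "binary",
            "scale_pos_weight": (y == 0).sum() / (y == 1).sum()}

booster = lgb.train(params_b, train_set, num_boost_round=100)
```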
3
votes
1 answer

importance ranking: error must be an object of class xgb.Booster

I ran an xgboost regression forecast (I also tried to complete it with xgb.Booster.complete). When trying to get the xgb.importance, I get the error message Error in xgboost::xgb.importance(case_xgbm) : model: must be an object of class…
user12938856
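The usual cause of this R error is passing the booster positionally: xgb.importance's first argument is feature_names, so the model has to be given by name, as in xgb.importance(model = case_xgbm). For comparison, a hedged sketch of the equivalent lookup in the Python API, on synthetic data:

```python
# A trained Booster exposes per-feature importance through get_score;
# data and parameters here are illustrative.
import numpy as np
import xgboost as xgb

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 5))
y = 2.0 * X[:, 0] + rng.normal(size=500)

dtrain = xgb.DMatrix(X, label=y)
booster = xgb.train({"objective": "reg:squarederror"}, dtrain,
                    num_boost_round=20)

print(booster.get_score(importance_type="gain"))
```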
3
votes
0 answers

LightGBM usage of init_score results in no boosting

It seems that if lightgbm.train is used with an initial score (init_score), it cannot boost this score. Here is a simple example: params = {"learning_rate": 0.1, "metric": "binary_logloss", "objective": "binary", "boosting_type":…
user1488793
  • 284
  • 2
  • 14
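A hedged sketch of a common pitfall here: for the binary objective, init_score is expected on the raw log-odds scale, and predict() does not add the init_score back, so the boosted part can look like "no boosting" unless it is combined manually. Data below is synthetic.

```python
import numpy as np
import lightgbm as lgb
from scipy.special import logit, expit

rng = np.random.default_rng(0)
X = rng.normal(size=(2_000, 5))
y = (X[:, 0] + rng.normal(size=2_000) > 0).astype(int)

base_prob = np.full(len(y), y.mean())   # a prior model's probabilities
init = logit(base_prob)                 # convert to raw log-odds

train_set = lgb.Dataset(X, label=y, init_score=init)
params = {"objective": "binary", "learning_rate": 0.1,
          "metric": "binary_logloss"}
booster = lgb.train(params, train_set, num_boost_round=50)

# predict(raw_score=True) returns only the boosted part; add init back.
prob = expit(booster.predict(X, raw_score=True) + init)
```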
3
votes
1 answer

How to prepare features and labels for XGBoost XGBRanker?

I'm going to use XGBRanker to make a recommender system. But in the official docs I haven't found an example of how to prepare the dataset. So in what format should the features and labels be passed to XGBRanker?
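A minimal sketch of the expected layout, with synthetic data: rows are grouped by query, labels are integer relevance grades, and group gives each query's row count (newer xgboost releases also accept a per-row qid array).

```python
import numpy as np
from xgboost import XGBRanker

rng = np.random.default_rng(0)
X = rng.normal(size=(9, 4))                 # 9 documents, 4 features
y = np.array([2, 1, 0, 1, 0, 0, 2, 1, 0])  # relevance grade per document
group = [3, 3, 3]                           # three queries, 3 docs each

ranker = XGBRanker(objective="rank:pairwise", n_estimators=10)
ranker.fit(X, y, group=group)
scores = ranker.predict(X)                  # predict needs no group info
```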
3
votes
1 answer

xgboost in R: what is the tolerance for xgb.cv's early_stopping_rounds?

In the xgb.cv function (from the library xgboost), one of the options is early_stopping_rounds. The description of this option is: If NULL, the early stopping function is not triggered. If set to an integer k, training with a validation set will…
Adrian
  • 9,229
  • 24
  • 74
  • 132
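As far as I can tell there is no tolerance parameter: any strict improvement in the evaluation metric resets the counter, and training stops after k rounds without one. A hedged Python analogue using xgboost.cv on synthetic data (the R interface behaves the same way to my knowledge):

```python
import numpy as np
import xgboost as xgb

rng = np.random.default_rng(0)
X = rng.normal(size=(1_000, 10))
y = X[:, 0] + rng.normal(size=1_000)
dtrain = xgb.DMatrix(X, label=y)

cv_results = xgb.cv(
    {"objective": "reg:squarederror", "eta": 0.1},
    dtrain,
    num_boost_round=500,
    nfold=5,
    early_stopping_rounds=10,   # stop after 10 rounds with no improvement
)
print(len(cv_results))          # rounds actually kept (up to the best one)
```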
3
votes
0 answers

How does LightGBM compute feature importance when using 'gain'

I need to calculate feature importance for my LightGBM Booster model. However, I cannot understand how the feature importance values are obtained when using the 'gain' type. The docs say: If "gain", result contains total gains of splits which use…
Akim
  • 139
  • 6
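A minimal sketch of the two importance types on a Booster, with synthetic data: 'gain' sums the loss reduction of every split that uses the feature, while 'split' merely counts those splits.

```python
import numpy as np
import lightgbm as lgb

rng = np.random.default_rng(0)
X = rng.normal(size=(1_000, 5))
y = 3 * X[:, 0] + rng.normal(size=1_000)

booster = lgb.train({"objective": "regression"},
                    lgb.Dataset(X, label=y), num_boost_round=30)

print(booster.feature_importance(importance_type="gain"))   # summed gains
print(booster.feature_importance(importance_type="split"))  # split counts
```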
3
votes
1 answer

Is there a way to know on which subsample of the data each XGBoost tree was fitted?

I'm practicing with XGBoost and I'd like to know on which subset of the data the trees of XGBRegressor were fitted. Here is the list of params that I use: params = {'learning_rate': 0.09, 'n_estimators': 5, 'objective':…
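To my knowledge XGBoost does not expose which rows each tree saw. A hedged workaround is to perform the row sampling yourself, one boosting round at a time, continuing training through the xgb_model argument; this approximates rather than reproduces the internal subsample behaviour.

```python
import numpy as np
import xgboost as xgb

rng = np.random.default_rng(0)
X = rng.normal(size=(1_000, 6))
y = X[:, 0] + rng.normal(size=1_000)

params = {"objective": "reg:squarederror", "eta": 0.09}
model, used_rows = None, []
for _ in range(5):                         # 5 trees, like n_estimators=5
    idx = rng.choice(len(X), size=int(0.8 * len(X)), replace=False)
    used_rows.append(idx)                  # now the subset is known
    dtrain = xgb.DMatrix(X[idx], label=y[idx])
    model = xgb.train(params, dtrain, num_boost_round=1, xgb_model=model)
```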
3
votes
1 answer

Pred_leaf in lightgbm

While going through the LightGBM docs I found that predict supports a pred_leaf argument. The docs say pred_leaf (bool, optional (default=False)) – Whether to predict leaf index. However, with data of shape (1, 28) and a gbm trained with num_boost_round =…
IanQ
  • 1,831
  • 5
  • 20
  • 29
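A minimal sketch of the shape contract, on synthetic data: with pred_leaf=True, predict returns one leaf index per tree, so a single row scored against num_boost_round=20 yields shape (1, 20).

```python
import numpy as np
import lightgbm as lgb

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 28))
y = (X[:, 0] > 0).astype(int)

booster = lgb.train({"objective": "binary"},
                    lgb.Dataset(X, label=y), num_boost_round=20)

leaves = booster.predict(X[:1], pred_leaf=True)
print(leaves.shape)    # (1, 20): the leaf index in each of the 20 trees
```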
3
votes
1 answer

CatBoost and GridSearch

model.fit(train_data, y=label_data, eval_set=eval_dataset) eval_dataset = Pool(val_data, val_labels) model = CatBoostClassifier(depth=8 or 10, iterations=10, task_type="GPU", devices='0-2', eval_metric='Accuracy', boosting_type="Ordered",…
PabloDK
  • 2,181
  • 2
  • 19
  • 42
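A minimal sketch using CatBoost's built-in grid_search method (scikit-learn's GridSearchCV over a CatBoostClassifier works similarly); the grid values and synthetic data are illustrative.

```python
import numpy as np
from catboost import CatBoostClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(1_000, 6))
y = (X[:, 0] > 0).astype(int)

model = CatBoostClassifier(iterations=10, eval_metric="Accuracy",
                           boosting_type="Ordered", verbose=False)
grid = {"depth": [8, 10], "learning_rate": [0.03, 0.1]}

# Returns the best parameters plus CV results for every combination.
result = model.grid_search(grid, X=X, y=y, cv=3)
print(result["params"])
```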
3
votes
1 answer

How to pass multiple hyperparameters to LightGBM after optimization?

I have used another optimization algorithm that returns the best params for LightGBM. hyper_optimized_clf_classifier = Util.hp_opt(lgb.LGBMClassifier(silent=True, random_state=1), X, y, score, verbose=True, n_estimators…
ERJAN
  • 23,696
  • 23
  • 72
  • 146
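Assuming the optimizer hands back a plain dict of hyperparameters, a minimal sketch is to unpack it into a fresh estimator with ** (the dict contents below are hypothetical):

```python
import lightgbm as lgb

best_params = {"n_estimators": 300, "learning_rate": 0.05,
               "num_leaves": 63}   # hypothetical optimizer output

clf = lgb.LGBMClassifier(random_state=1, **best_params)
# clf.fit(X, y) would then train with the optimized configuration.
```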
3
votes
1 answer

Why does xgboost produce the same predictions and nan values for features when using entire dataset?

Summary I am using Python v3.7 and xgboost v0.81. I have continuous data (y) at a US state level by each week from 2015 to 2019. I'm trying to regress on the following features to y: year, month, week, region (encoded). I've set the train as August…
dnly09
  • 33
  • 1
  • 5
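One hedged diagnostic for this symptom: if the booster never finds a split, get_score() comes back empty and every prediction collapses to a constant, which can surface downstream as NaN importances. The degenerate data below forces that situation.

```python
import numpy as np
import xgboost as xgb

X = np.ones((100, 4))              # constant features: no split is possible
y = np.random.default_rng(0).random(100)

dtrain = xgb.DMatrix(X, label=y)
booster = xgb.train({"objective": "reg:squarederror"}, dtrain,
                    num_boost_round=10)

print(booster.get_score(importance_type="gain"))   # {} -> no splits made
print(np.unique(booster.predict(dtrain)))          # a single constant value
```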
3
votes
3 answers

Plot number formatting in XGBoost plot_importance()

I've trained an XGBoost model and used plot_importance() to plot which features are the most important in the trained model. However, the numbers in the plot have several decimal places, which flood the plot and do not fit. I have…
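A minimal sketch of two options: show_values=False drops the numbers entirely, and newer xgboost releases also accept a values_format string such as "{v:.2f}" to round them instead. The model and data are illustrative.

```python
import numpy as np
import xgboost as xgb
from matplotlib import pyplot as plt

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 5))
y = X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=500)

model = xgb.XGBRegressor(n_estimators=30).fit(X, y)

xgb.plot_importance(model, importance_type="gain",
                    values_format="{v:.2f}")   # round to two decimal places
plt.show()
```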
3
votes
1 answer

Solr difference between Query Elevation and Query Boosting

Can anybody explain the differences between Query Elevation and Boost Query in Solr? I couldn't find anything about the pros and cons of these two boosting mechanisms. Thank you very much.
LifeInstructor
  • 1,622
  • 1
  • 20
  • 24