Questions tagged [boosting]

From the docs:

"Boosting" is a machine learning ensemble meta-algorithm, primarily for reducing bias (and also variance) in supervised learning, and a family of machine learning algorithms that convert weak learners to strong ones.

Also, from the docs:

Boosting is the process of enhancing the relevancy of a document or field. Field-level mapping allows you to define an explicit boost level on a specific field. The boost field mapping (applied on the root object) lets the field's content control the boost level of the document.

181 questions
0 votes, 1 answer

sklearn HistGradientBoostingClassifier - Validation

For the model below, how do I output/recreate the validation set so I can save it for future reference? from sklearn.experimental import enable_hist_gradient_boosting from sklearn.ensemble import HistGradientBoostingClassifier model=…
0 votes, 1 answer

LGBMClassifier + Unbalanced data + GridSearchCV()

The dependent variable is binary, the data is unbalanced 1:10, the dataset has 70k rows, the scoring is the ROC curve, and I'm trying to use LGBM + GridSearchCV to get a model. However, I'm struggling with the parameters as sometimes it doesn't…
Chris • 2,019 • 5 • 22 • 67
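A hedged sketch of the general pattern — grid search scored by ROC AUC with balanced sample weights — using sklearn's GradientBoostingClassifier as a runnable stand-in for LGBMClassifier (which offers the same idea via its class_weight and scale_pos_weight parameters):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import GridSearchCV
from sklearn.utils.class_weight import compute_sample_weight

# Roughly 1:10 imbalanced binary target, as in the question (smaller n for speed).
X, y = make_classification(n_samples=1000, weights=[0.9, 0.1], random_state=0)

# Reweight the minority class; GridSearchCV forwards sample_weight to fit().
w = compute_sample_weight("balanced", y)

grid = GridSearchCV(
    GradientBoostingClassifier(n_estimators=50, random_state=0),
    param_grid={"learning_rate": [0.05, 0.1], "max_depth": [2, 3]},
    scoring="roc_auc",  # rank candidates by ROC AUC, not accuracy
    cv=3,
)
grid.fit(X, y, sample_weight=w)
```

Scoring by ROC AUC matters on 1:10 data because plain accuracy rewards predicting the majority class everywhere.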
0 votes, 1 answer

word's "boosting" during TF-IDF (topic modeling)

Here is the case. Let's say we have a dataset containing messages from a chat and we want to do topic modeling on it (a few topics, for example). Let us assume that topic A might be (and should be) represented by a few words, but I know (let's say from some…
Yelon • 101 • 2
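One simple way to boost known topic words (a sketch; the word list and the boost factor here are made up) is to scale their TF-IDF columns before handing the matrix to the topic model:

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer

docs = ["the cat sat on the mat", "dogs chase the cat", "stocks fell sharply today"]
boost_words = {"cat": 3.0}  # hypothetical words known a priori to mark topic A

vec = TfidfVectorizer()
X = vec.fit_transform(docs).toarray()

# Scale the TF-IDF columns of the known topic words so they weigh more in
# whatever topic model (LDA, NMF, ...) consumes the matrix next.
X_boosted = X.copy()
for word, factor in boost_words.items():
    if word in vec.vocabulary_:
        X_boosted[:, vec.vocabulary_[word]] *= factor
```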
0 votes, 1 answer

How to use a Keras model inside of sklearn's AdaBoost?

I have a Keras model and want to boost it using sklearn's AdaBoostClassifier. Unfortunately I get the following error message and have no idea how to solve it. I would be very happy about any help! ValueError Traceback…
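A frequent cause of such errors is that AdaBoostClassifier expects a full sklearn estimator: fit() must accept sample_weight, and classes_ and predict_proba must exist. A minimal wrapper sketch showing the required interface, with LogisticRegression standing in for the Keras network so the snippet runs without TensorFlow:

```python
import numpy as np
from sklearn.base import BaseEstimator, ClassifierMixin
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.linear_model import LogisticRegression

class NetWrapper(BaseEstimator, ClassifierMixin):
    """Minimal sklearn-compatible wrapper: AdaBoost needs fit() to accept
    sample_weight, a classes_ attribute, and predict()/predict_proba().
    A Keras model would be built and trained inside fit(); a logistic
    regression stands in here."""

    def fit(self, X, y, sample_weight=None):
        self.classes_ = np.unique(y)
        self._model = LogisticRegression()  # stand-in for the Keras net
        self._model.fit(X, y, sample_weight=sample_weight)
        return self

    def predict(self, X):
        return self._model.predict(X)

    def predict_proba(self, X):
        return self._model.predict_proba(X)

X, y = make_classification(n_samples=300, random_state=0)
clf = AdaBoostClassifier(NetWrapper(), n_estimators=5).fit(X, y)
ada_acc = clf.score(X, y)
```

The same shape of wrapper around a Keras model (training it inside fit with the given sample_weight) is what AdaBoost needs.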
0 votes, 0 answers

XGBoost - python - fitting a regressor

I'm trying to fit an xgboost regressor on really large data. I was hoping to use early stopping after 50 trees if no improvement is made, and to print the evaluation metric every 10 trees (I'm using RMSE as my main metric). My current code's the…
jmauricio • 115 • 1 • 7
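A runnable sketch of the same pattern using sklearn's GradientBoostingRegressor, whose n_iter_no_change plays the role of xgboost's early_stopping_rounds, with RMSE reported every 10 trees via staged_predict:

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=1000, n_features=5, noise=10.0, random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, random_state=0)

# n_iter_no_change stops training when the internal held-out score has not
# improved for 50 consecutive trees (xgboost: early_stopping_rounds=50).
model = GradientBoostingRegressor(
    n_estimators=300, n_iter_no_change=50, validation_fraction=0.2, random_state=0
)
model.fit(X_tr, y_tr)

# Report validation RMSE every 10 trees.
for i, y_pred in enumerate(model.staged_predict(X_val), start=1):
    if i % 10 == 0:
        rmse = float(np.sqrt(mean_squared_error(y_val, y_pred)))
        print(f"trees={i:3d}  RMSE={rmse:.2f}")
```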
0 votes, 0 answers

Using Linear Regression like gradient boosting

Can I use linear regression the way the gradient boosting technique does? For m = 1 to M (number of linear regressions): (1) fit the model to the data; (2) predict the values and find the residuals; (3) replace the dependent variable data with…
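The loop described can be written directly; a minimal sketch with squared loss and LinearRegression base learners:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.1, size=200)

# Gradient boosting with squared loss: each stage (1) fits a learner to the
# (2) current residuals, and (3) the ensemble prediction accumulates the
# shrunken stage predictions.
pred = np.zeros_like(y)
learning_rate = 0.5
for m in range(10):
    residual = y - pred
    stage = LinearRegression().fit(X, residual)
    pred += learning_rate * stage.predict(X)

final_mse = float(np.mean((y - pred) ** 2))
```

Note that a sum of linear models is itself linear, so this converges to the ordinary least-squares fit; boosting only adds expressive power with nonlinear or regularized weak learners.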
0 votes, 1 answer

Boosting algorithm realization in Python

Using sklearn one can construct a Bagging algorithm for non-tree estimators (for example, for SVC). But there is no Boosting realization in sklearn or in any other well-known packages. Am I missing something and there is some existing Boosting…
Keithx • 2,994 • 15 • 42 • 71
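In fact sklearn's AdaBoostClassifier accepts any base classifier whose fit() supports sample_weight, including SVC — a small sketch:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.svm import SVC

X, y = make_classification(n_samples=300, random_state=0)

# AdaBoostClassifier works with any classifier whose fit() accepts
# sample_weight; probability=True gives SVC the predict_proba that older
# sklearn releases need for their default boosting algorithm.
boosted_svc = AdaBoostClassifier(SVC(probability=True), n_estimators=10, random_state=0)
boosted_svc.fit(X, y)
train_acc = boosted_svc.score(X, y)
```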
0 votes, 1 answer

How can I use AdaBoostClassifier better?

I have to solve a multiclass classification problem in Python. I started to use ensembles, beginning with AdaBoostClassifier, but after a grid search I get bad results. What I did is to use the tuned classifier (in the list of classifiers that I tried)…
fabianod • 501 • 4 • 17
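A hedged sketch of one tuning direction: AdaBoost's default depth-1 stumps are often too weak for multiclass data, so search the learning rate with a slightly deeper base tree (iris used here as a stand-in dataset):

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# A depth-2 base tree plus a tuned learning rate often helps multiclass
# AdaBoost more than raising n_estimators alone.
grid = GridSearchCV(
    AdaBoostClassifier(DecisionTreeClassifier(max_depth=2), random_state=0),
    param_grid={"learning_rate": [0.1, 0.5, 1.0], "n_estimators": [50, 100]},
    cv=3,
)
grid.fit(X, y)
```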
0 votes, 2 answers

Can the NGBoost algorithm process missing values automatically?

I found a new GBDT algorithm named NGBoost, invented by stanfordmlgroup. I wanted to use it, so I ran pip install ngboost==0.2.0 to install it. I then trained on a dataset without imputing or deleting missing values; however, I get an error: Input…
Cthulhu • 11 • 2
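The error indicates NGBoost does not impute NaNs for you; a sketch of the usual fix is an imputation pipeline (GradientBoostingRegressor stands in here so the snippet runs without ngboost; ngboost's regressor would slot into the same final step):

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.impute import SimpleImputer
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = X[:, 0] + rng.normal(scale=0.5, size=200)
X[rng.random(X.shape) < 0.1] = np.nan      # ~10% missing values

# Impute before the booster; estimators that reject NaN input (NGBoost
# among them) then see a complete matrix at fit and predict time.
model = make_pipeline(
    SimpleImputer(strategy="median"),
    GradientBoostingRegressor(random_state=0),
)
model.fit(X, y)
train_r2 = model.score(X, y)
```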
0 votes, 1 answer

tensorflow boosted tree classifier multi class

In the current version of TF (2.2.0) there is an option to do multi-class classification (i.e., more than two classes, by changing n_classes to the relevant number in the estimator params). However, all previous examples that I saw, for example the…
Ron_ad • 73 • 6
0 votes, 0 answers

Why is prediction error discrete in adabag?

I've got a table of 55 observations with 5 variables (F, H, R, T, U) and 1 classifier variable ("Group") in which I have two groups. I'm doing data sampling by splitting the data into a training set (70%) and a test set (30%). Then I run adaboosting…
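A likely explanation (a sketch of the arithmetic, assuming the 30% split of 55 rows gives 17 test observations): the reported error is a count of misclassifications divided by the test-set size, so only a handful of discrete values are possible:

```python
# 55 observations with a 30% test split gives 17 test rows. The error rate
# is (#misclassified) / 17, so it can only take these 18 discrete values.
n_test = 17
possible_errors = [k / n_test for k in range(n_test + 1)]
```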
0 votes, 1 answer

MNIST dataset boosting

I am trying to apply Gradient Boosting to the MNIST dataset. This is my code: library(dplyr) library(caret) mnist <- snedata::download_mnist() mnist_num <- as.data.frame(lapply(mnist[1:10000,], as.numeric)) %>% mutate(id = row_number()) mnist_num…
user12157475
0 votes, 0 answers

Boosting in MNIST dataset

I am trying to apply Gradient Boosting to the MNIST dataset. library(gbm) boost_mnist <- gbm(Label ~ ., data=mnist_train, distribution="bernoulli", n.trees=70, interaction.depth=4, shrinkage=0.3) yhat_boost <- predict(boost_mnist, newdata=mnist_test,…
user12157475
0 votes, 0 answers

CatBoost and prediction variance

CatBoost version: 0.21. Operating system: Windows. CPU: Intel i9. I'm running the CatBoost Python classification tutorial with the Amazon dataset (https://github.com/catboost/tutorials/blob/master/classification/classification_tutorial.ipynb). To make…
PabloDK • 2,181 • 2 • 19 • 42
0 votes, 1 answer

int vs Float in regression modeling

This is a general question to understand a concept. I have a dataframe whose columns all have float values (precision varies from 2 to 8 digits). I use GBM to train my model. When I train my model with all float values, the r2 score is 0.78. Same when all…
CodeTry • 312 • 1 • 19
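A toy illustration of why the cast can hurt (an extreme, made-up case: features in [0, 1] all truncate to 0 as ints, destroying the signal entirely):

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(300, 3))   # float features in [0, 1]
y = X @ np.array([3.0, -2.0, 1.0])

X_int = X.astype(int)                      # truncation: every value becomes 0

# The float model can split on the real variation; the int model sees
# constant features and can only predict the mean.
r2_float = GradientBoostingRegressor(random_state=0).fit(X, y).score(X, y)
r2_int = GradientBoostingRegressor(random_state=0).fit(X_int, y).score(X_int, y)
```

Real data is rarely this extreme, but the same mechanism — within-unit variation discarded by the cast — is what degrades the r2 score.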