Questions tagged [boosting]

Boosting is a machine learning ensemble meta-algorithm used in supervised learning, and a family of machine learning algorithms that convert weak learners into strong ones. Also: boosting is the process of enhancing the relevancy of a document or field in search.

From the docs:

"Boosting" is a machine learning ensemble meta-algorithm for primarily reducing bias, and also variance in supervised learning, and a family of machine learning algorithms that convert weak learners to strong ones.

Also:

From the docs:

Boosting is the process of enhancing the relevancy of a document or field. Field-level mapping allows an explicit boost level to be defined on a specific field. The boost field mapping (applied on the root object) allows a boost field to be defined whose content controls the boost level of the document.
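To illustrate the first (machine learning) sense of the term, a minimal sketch assuming scikit-learn is available: AdaBoost turns depth-1 decision stumps, weak learners on their own, into a noticeably stronger ensemble.

```python
# Minimal sketch: boosting turns weak learners (depth-1 stumps) into a strong ensemble.
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, n_informative=8, random_state=0)

stump = DecisionTreeClassifier(max_depth=1)                      # one weak learner
boosted = AdaBoostClassifier(n_estimators=200, random_state=0)   # default base learner is a depth-1 tree

print("single stump  :", cross_val_score(stump, X, y).mean())
print("boosted stumps:", cross_val_score(boosted, X, y).mean())
```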

181 questions
1 vote · 0 answers

Is taking a weighted average of trees' predictions considered boosting?

When constructing a random forest, one way is to take the simple average of all trees' predictions. Alternatively, we can calculate a weight for each tree as a function of its error rate. Is that a kind of boosting? I originally…
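A hedged sketch of what the excerpt describes, on synthetic data with training accuracy chosen arbitrarily as the weight: the trees are still grown independently on bootstrap samples, so this is weighted bagging rather than boosting, which would fit each new tree on the errors of the previous ones.

```python
# Sketch: weight independently grown trees by a function of their error rate.
# Nothing here is sequential, so this is weighted bagging, not boosting.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=600, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

rng = np.random.default_rng(0)
trees, weights = [], []
for _ in range(25):
    idx = rng.integers(0, len(X_tr), len(X_tr))          # bootstrap sample
    t = DecisionTreeClassifier(random_state=0).fit(X_tr[idx], y_tr[idx])
    trees.append(t)
    weights.append(t.score(X_tr, y_tr))                  # weight = 1 - error rate

weights = np.array(weights) / np.sum(weights)
proba = sum(w * t.predict_proba(X_te) for w, t in zip(weights, trees))
print("weighted-average accuracy:", (proba.argmax(axis=1) == y_te).mean())
```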
1 vote · 1 answer

How can I calculate the survival function in a gbm package analysis?

I would like to analyze my data with a gradient boosted model. However, since my data come from a cohort study, I have trouble understanding the result of this model. Here's my code. The analysis was performed based on the example…
SJUNLEE
  • 167
  • 2
  • 14
1 vote · 0 answers

Elasticsearch: How to boost documents based on how early the keyword matches in the text?

I want to understand how to boost documents in full-text search based on how early the keyword is found in the requested field. Example: Document 1 (Message Field): Joe and Sarah went to the store. Document 2 (Message Field): Sarah and…
Bhushan Pant
  • 1,445
  • 2
  • 13
  • 29
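The entry does not say how this was eventually solved; one hedged possibility is Elasticsearch's span_first query, which only matches when the term occurs within the first N token positions, so adding it as a should clause rewards early matches. Written as a plain Python dict so it can be sent with any client; the message field name and the position cutoff follow the example above and are otherwise assumptions.

```python
import json

# Hypothetical query: reward documents whose "message" field contains "sarah"
# within the first 3 token positions, on top of an ordinary match.
query = {
    "query": {
        "bool": {
            "must": [{"match": {"message": "sarah"}}],
            "should": [{
                "span_first": {
                    "match": {"span_term": {"message": "sarah"}},
                    "end": 3,          # only match within the first 3 positions
                    "boost": 2.0
                }
            }],
        }
    }
}
print(json.dumps(query, indent=2))     # send with curl or an Elasticsearch client
```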
1 vote · 1 answer

How to use the xgboost algorithm for multi-variable prediction?

I have a set of features: x1, x2, x3. Furthermore, I have a set of labels: y1, y2, y3. For example, my x variables are height, weight and years of education. Each yi represents a grade in one of the following fields: Science, Arts and Management. Each…
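One common approach, sketched here under the assumption that the xgboost and scikit-learn packages are available (the data below is synthetic): wrap an XGBoost regressor in MultiOutputRegressor, which fits one booster per target column.

```python
# Sketch: multi-target prediction by fitting one XGBoost booster per label column.
import numpy as np
from sklearn.multioutput import MultiOutputRegressor
from xgboost import XGBRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))            # e.g. height, weight, years of education
Y = np.column_stack(                     # e.g. grades in Science, Arts, Management
    [X @ rng.normal(size=3) + 0.1 * rng.normal(size=200) for _ in range(3)]
)

model = MultiOutputRegressor(XGBRegressor(n_estimators=100))
model.fit(X, Y)
print(model.predict(X[:2]))              # one prediction per target column
```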
1 vote · 0 answers

How to obtain the whole decision process of a sklearn GBDT?

When a GBDT is built with sklearn.ensemble.GradientBoostingClassifier, I get a set of trees. I can figure out the structure of a single tree, but for the whole set, how do I know in which order the trees are accessed? Take the following code…
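For reference, scikit-learn stores the fitted trees stage by stage in the estimators_ attribute, and they are applied in exactly that order, each output scaled by learning_rate and added to the running score. A minimal sketch:

```python
# Sketch: walk the trees of a fitted GradientBoostingClassifier in the order
# they are applied (stage by stage, one regression tree per class per stage).
from sklearn.datasets import load_iris
from sklearn.ensemble import GradientBoostingClassifier

X, y = load_iris(return_X_y=True)
clf = GradientBoostingClassifier(n_estimators=5, random_state=0).fit(X, y)

print(clf.estimators_.shape)              # (n_stages, n_classes)
for stage, stage_trees in enumerate(clf.estimators_):
    for k, tree in enumerate(stage_trees):
        print(f"stage {stage}, class {k}: {tree.tree_.node_count} nodes")
```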
1 vote · 1 answer

What are the leaf values in a sklearn GBDT, and how do I obtain them?

I can export the structure of a GBDT to an image with the tree.export_graphviz function: ``` Python3 from sklearn.datasets import load_iris from sklearn import tree from sklearn.ensemble import GradientBoostingClassifier clf =…
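For context, each estimator in the ensemble is a plain DecisionTreeRegressor, so its leaf values can be read from tree_.value; scaled by learning_rate, they are what gets added to the initial prediction. A hedged sketch:

```python
# Sketch: read the leaf values of the first tree of a fitted GBDT.
from sklearn.datasets import load_iris
from sklearn.ensemble import GradientBoostingClassifier

X, y = load_iris(return_X_y=True)
clf = GradientBoostingClassifier(n_estimators=3, random_state=0).fit(X, y)

tree = clf.estimators_[0, 0].tree_        # first stage, first class
is_leaf = tree.children_left == -1        # leaves have no children
print("leaf values:", tree.value[is_leaf].ravel())
# Final score = initial prediction + learning_rate * (leaf value on each tree's path).
```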
1 vote · 1 answer

Why do tree-based models not need one-hot encoding for nominal data?

We usually one-hot encode nominal data so that distances between features, or their weights, make sense, but I have often heard that tree-based models like random forests or boosting models do not need one-hot encoding. But I have…
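For intuition, a hedged sketch on toy data (category names made up): a tree can split repeatedly on integer category codes and isolate any single category, so ordinal encoding is usually sufficient for tree ensembles, even though the codes impose an arbitrary order.

```python
# Sketch: a tree ensemble trained on ordinal-encoded nominal data, no one-hot needed.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.preprocessing import OrdinalEncoder

colors = np.array([["red"], ["green"], ["blue"], ["green"], ["red"], ["blue"]] * 50)
y = (colors.ravel() == "green").astype(int)       # toy target tied to one category

X = OrdinalEncoder().fit_transform(colors)        # nominal -> integer codes
clf = RandomForestClassifier(random_state=0).fit(X, y)
print("training accuracy:", clf.score(X, y))      # splits on the codes isolate "green"
```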
1 vote · 0 answers

Scikit-learn GradientBoostingClassifier random_state not working

So I was messing around with different classifiers in sklearn and found that, regardless of the value of the random_state parameter, GradientBoostingClassifier always returns the same values. For example, when I run the following code: import…
Skip
  • 83
  • 2
  • 5
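A likely explanation, worth checking against the scikit-learn docs: with the defaults subsample=1.0 and max_features=None there is essentially nothing random to seed, so every random_state gives the same model; the seed starts to matter once row or feature subsampling is enabled. A small sketch:

```python
# Sketch: random_state only changes the result once some source of randomness
# (here row subsampling) is switched on.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

X, y = make_classification(n_samples=300, random_state=0)

for seed in (0, 1):
    plain = GradientBoostingClassifier(random_state=seed).fit(X, y)
    subsampled = GradientBoostingClassifier(random_state=seed, subsample=0.5).fit(X, y)
    print(seed, plain.score(X, y), subsampled.score(X, y))
```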
1 vote · 1 answer

How to calculate GBM accuracy in R

I used the gbm() function to create the model and I want to get the accuracy. Here is my code: df<-read.csv("http://freakonometrics.free.fr/german_credit.csv", header=TRUE) str(df) F=c(1,2,4,5,7,8,9,10,11,12,13,15,16,17,18,19,20,21) for(i in F)…
신익수
  • 67
  • 3
  • 8
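The question itself uses R's gbm; as a rough Python analogue with scikit-learn (synthetic data), accuracy is just the share of predicted labels that match the truth after thresholding the predicted probabilities:

```python
# Sketch (Python analogue of the R question): accuracy of a boosted classifier.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

clf = GradientBoostingClassifier().fit(X_tr, y_tr)
pred = (clf.predict_proba(X_te)[:, 1] > 0.5).astype(int)   # threshold at 0.5
print("accuracy:", accuracy_score(y_te, pred))
```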
1 vote · 0 answers

Boosting multiple "terms" in elasticsearch

Rather than matching on multiple fields, I want to give this clause a boosted score. It's part of a bigger query, so unfortunately I cannot avoid boosting this part of the query. I tried: "terms": { "some.keyword": ["a12c", "b12c"], …
PascalVKooten
  • 20,643
  • 17
  • 103
  • 160
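For reference, the terms query accepts a boost parameter of its own, so one hedged possibility is to keep the clause exactly as in the excerpt and add the boost inside it (shown as a plain Python dict; the field name and values are copied from the excerpt):

```python
import json

# Hypothetical fragment: a terms clause carrying its own boost, to be embedded
# in the larger bool query the excerpt mentions.
clause = {
    "terms": {
        "some.keyword": ["a12c", "b12c"],
        "boost": 2.0
    }
}
print(json.dumps(clause, indent=2))
```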
1 vote · 0 answers

Why does R tell me I have NAs in my prob distribution when I call the sample() function?

I am running into an issue when I try to run the function below. The exact error I am getting is: Error in sample.int(length(x), size, replace, prob) : NA in probability vector. I use the print(t) line to see where it's stopping, and it seems to…
Pedr
  • 11
  • 2
1 vote · 1 answer

Elasticsearch / Searchkick gem - boosted fields do not return results with special characters (e.g. apostrophes)

We're using the searchkick gem in our app and have many documents with fields that contain special characters such as apostrophes, e.g. an offer with the title Valentine's Day Special. Without boosters, a search for Valentines or Valentine's or…
1 vote · 1 answer

Different values when fitting a boosted tree twice

I use the R package adabag to fit boosted trees to a (large) data set (140 observations with 3 845 predictors). I ran this method twice with the same parameters and the same data set, and each time a different accuracy value was returned (I defined a…
bjn
  • 195
  • 1
  • 7
1 vote · 1 answer

Error with XGBoost setup

I'm pretty new to R and having some trouble with the XGBoost function. This is the code I have so far: test_rows <- sample.int(nrow(ccdata), nrow(ccdata)/3) test <- ccdata[test_rows,] train <-…
New2R
  • 11
  • 1
  • 2
1 vote · 3 answers

What are some specific examples of Ensemble Learning?

What are some concrete real-life problems that can be solved using boosting/bagging algorithms? Code snippets would be greatly appreciated.
28r
  • 329
  • 1
  • 4
  • 11
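Since the question explicitly asks for snippets, a hedged pair of minimal scikit-learn examples on synthetic data: bagging trains trees independently and averages them, while boosting fits them one after another, each new learner focusing on the previous ones' mistakes.

```python
# Sketch: bagging vs boosting on the same synthetic classification task.
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, BaggingClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=500, random_state=0)

bagging = BaggingClassifier(n_estimators=100, random_state=0)    # parallel trees, averaged
boosting = AdaBoostClassifier(n_estimators=100, random_state=0)  # sequential, error-driven

print("bagging :", cross_val_score(bagging, X, y).mean())
print("boosting:", cross_val_score(boosting, X, y).mean())
```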