Questions tagged [xgbclassifier]

112 questions
2
votes
1 answer

XGBoost and xgb.DMatrix

I am getting this error when using XGBoost in R: Error in xgb.DMatrix(data, label = label, missing = missing) : 'data' has class 'character' and length 1791. 'data' accepts either a numeric matrix or a single filename. Below is the code I am…
ibocus
  • 45
  • 1
  • 9
2
votes
1 answer

How to fix "name 'cross_validation' is not defined" error in Python

I am trying to run XGBClassifier parameter tuning and get a "name 'cross_validation' is not defined" error following this line of code: kfold_5 = cross_validation.KFold(n = len(X), shuffle = True, n_folds = numFolds) Maybe I didn't import the…
Erez Ben-Moshe
  • 149
  • 2
  • 3
  • 10
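
For readers hitting the same NameError: the `sklearn.cross_validation` module was removed from scikit-learn, and `KFold` now lives in `sklearn.model_selection` with a different signature (`n_splits` instead of `n` and `n_folds`). A minimal sketch of the updated call, assuming a recent scikit-learn; the array `X` and the value of `num_folds` are placeholders, not the asker's data:

```python
import numpy as np
from sklearn.model_selection import KFold

# Placeholder data standing in for the question's X (assumption).
X = np.arange(20).reshape(10, 2)
num_folds = 5

# KFold no longer takes the dataset size; the data is passed to split() instead.
kfold_5 = KFold(n_splits=num_folds, shuffle=True, random_state=0)

# Each iteration yields index arrays for the training and test portion of a fold.
fold_sizes = [len(test_idx) for _, test_idx in kfold_5.split(X)]
print(fold_sizes)
```

The `kfold_5` object can then be passed as the `cv` argument of scikit-learn's search and scoring helpers.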
1
vote
1 answer

Fine-tuning an already trained XGBoost classification model

I have trained an XGBoost classification model for sentiment analysis of product reviews. However, there are certain cases where the model predictions are not as expected. For example, when I input the review "The delivery was a bit late but the…
Chris
  • 154
  • 8
1
vote
1 answer

XGBoost - huge difference between xgb.cv and cross_val_score

I was performing cross-validation using xgboost.cv but then wanted to change to cross_val_score to use it with GridSearchCV. Before moving to hyperparameter tuning I checked whether the results from xgboost.cv and cross_val_score are similar and found out…
overb
  • 127
  • 2
  • 10
1
vote
0 answers

XGBoost train and test logloss curves are exactly the same

I'm using the XGBoost model and I am having some trouble generalizing my model. I tried to visualize the learning curves of my train and test sets. However, both are exactly the same. It looks like an error to me, but I do not know the reason. The…
user15565396
1
vote
0 answers

Index (or index-like) variable has a higher feature importance than the rest of the variables?

While evaluating xgboost model performance, I find that the transaction_id column, which is just a column of numbers from 1 to the length of the dataframe, has a higher importance than the rest of the columns. I also have a random-values column which has a zero…
1
vote
1 answer

F1/F0.5 score as eval_metric in XGBClassifier

I'm performing a classification task using XGBClassifier and want to reuse sklearn's functionality as much as possible. In particular, I'm interested in defining a custom scorer using the f_beta function to define an F0.5 score. When I run the following…
Roberto
  • 649
  • 1
  • 8
  • 22
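
For reference, the F-beta score mentioned in this question is the weighted harmonic mean of precision and recall, with beta < 1 favoring precision. A minimal pure-Python sketch of the formula follows; the function name `fbeta` and the sample precision/recall values are illustrative assumptions, and scikit-learn's own `fbeta_score` would normally be used in practice:

```python
def fbeta(precision: float, recall: float, beta: float) -> float:
    """F-beta from precision and recall; beta < 1 weights precision more heavily."""
    denom = beta**2 * precision + recall
    if denom == 0:
        # Both precision and recall are zero, so the score is defined as 0 here.
        return 0.0
    return (1 + beta**2) * precision * recall / denom


# Illustrative values only (not taken from the question).
f05 = fbeta(precision=0.8, recall=0.5, beta=0.5)
f1 = fbeta(precision=0.8, recall=0.5, beta=1.0)
print(f05, f1)
```

With precision higher than recall, the F0.5 value comes out above F1, which is the behavior a precision-focused scorer is after.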
1
vote
0 answers

XGB Classifier error Invalid classes inferred from unique values of `y`

This is my first question here. I've trained an XGB Classifier and it worked fine locally, but I'm trying the same in a Jupyter notebook on a Google Cloud virtual machine and it gets an error. My code: `param_grid = {"max_depth": [3, None], …
Jorge Luis
  • 11
  • 1
  • 2
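
This error is commonly reported when the labels in `y` are not consecutive integers starting at 0, which newer XGBoost releases expect. Whether that is the cause here cannot be confirmed from the truncated excerpt, so the following is only a sketch of the usual remedy: remapping labels to 0..n-1 before fitting. The helper `encode_labels` and the sample labels are hypothetical; scikit-learn's `LabelEncoder` does the same job.

```python
def encode_labels(y):
    """Map arbitrary class labels to consecutive integers 0..n_classes-1."""
    classes = sorted(set(y))
    index = {label: i for i, label in enumerate(classes)}
    return [index[label] for label in y], classes


# Hypothetical labels that start at 1 instead of 0.
y_raw = [1, 3, 2, 3, 1]
y_encoded, classes = encode_labels(y_raw)

# Predictions on the encoded scale can be mapped back through `classes`.
y_decoded = [classes[i] for i in y_encoded]
print(y_encoded, y_decoded)
```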
1
vote
1 answer

Using base_score with XGBClassifier to provide initial priors for each target class

When using XGBRegressor, it's possible to use the base_score setting to set the initial prediction value for all data points. Typically that value would be set to the mean of the observed values in the training set. Is it possible to achieve a…
Jivan
  • 21,522
  • 15
  • 80
  • 131
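
As background for this question: with a softmax objective, starting every class at the log of its empirical prior makes the initial predicted probabilities equal to those priors. The sketch below only demonstrates that arithmetic in pure Python; the helper names and the toy labels are assumptions, and how such margins are handed to XGBClassifier (for example through a per-row `base_margin`) is left to the library documentation.

```python
import math
from collections import Counter


def class_log_priors(labels):
    """Log of each class's empirical frequency, ordered by class label."""
    counts = Counter(labels)
    n = len(labels)
    return [math.log(counts[c] / n) for c in sorted(counts)]


def softmax(margins):
    """Convert raw per-class margins into probabilities."""
    shift = max(margins)
    exps = [math.exp(m - shift) for m in margins]
    total = sum(exps)
    return [e / total for e in exps]


# Toy labels (assumption): class 0 three times, class 1 once, class 2 twice.
labels = [0, 0, 0, 1, 2, 2]
margins = class_log_priors(labels)
initial_probs = softmax(margins)
print(initial_probs)
```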
1
vote
0 answers

Using a dictionary of multidimensional data for training

I would like to train my XGBoost model using a dictionary as training data, containing 5 keys, with each key holding windows of an array of data. Here is the figure explaining it: I cannot give this data directly to the model (using .fit) because for…
1
vote
1 answer

Binary classification prediction confidence 2: Electric Boogaloo

I've been working on an XGBoost Classifier with 5 classes for which I used to get the confidence of each prediction from the predict_proba() function (total sum of the confidences would add up to 1, easy enough). Now that I've switched to a Binary…
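
In the binary case the two class probabilities are p and 1 - p, so the confidence of the predicted label is simply the larger of the two. A minimal sketch of that arithmetic, assuming a logistic objective where the raw margin is passed through a sigmoid; the function names and sample values are hypothetical:

```python
import math


def sigmoid(margin: float) -> float:
    """Turn a raw log-odds margin into the probability of the positive class."""
    return 1.0 / (1.0 + math.exp(-margin))


def binary_confidence(p_positive: float):
    """Return the predicted label and the probability assigned to that label."""
    label = 1 if p_positive >= 0.5 else 0
    return label, max(p_positive, 1.0 - p_positive)


# Illustrative probability of the positive class (not from the question).
label, confidence = binary_confidence(0.2)
print(label, confidence)
```

Here a positive-class probability of 0.2 means the model predicts class 0 with confidence 0.8, which matches taking the row-wise maximum of a two-column probability output.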
1
vote
1 answer

How to prevent features from interacting with each other in a Python XGBClassifier model

I have trained this model: model = XGBClassifier( #XGBClassifier objective='binary:logistic', base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, enable_categorical=False, …
Giampaolo Levorato
  • 1,055
  • 1
  • 8
  • 22
1
vote
0 answers

How to retrieve best model from xgboost.train

I'm learning how to use XGBClassifier to generate predictions, and I found out that xgboost.train is what XGBClassifier calls under the hood. I guess the first question is: is there any reason to favor one way over another, or are they not…
1
vote
2 answers

Explanation of SHAP values from XGBoost

I fitted an XGBoost model for binary classification. I am trying to understand the fitted model and to use SHAP to explain the prediction. However, I am confused by the force plot generated by SHAP. I expected that the output value should be…
Felix Chan
  • 185
  • 1
  • 2
  • 9
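
A frequent source of this confusion is that, for a binary XGBoost classifier, SHAP's tree explainer by default reports contributions in raw log-odds units rather than probabilities. Assuming that default applies here (the excerpt is truncated, so this is not certain), the base value plus the sum of the SHAP values gives the margin, and a sigmoid converts it to a probability. The function name and numbers below are illustrative only:

```python
import math


def margin_to_probability(base_value: float, shap_values) -> float:
    """Sum the base value and per-feature contributions, then apply a sigmoid."""
    margin = base_value + sum(shap_values)
    return 1.0 / (1.0 + math.exp(-margin))


# Hypothetical base value and per-feature contributions in log-odds units.
probability = margin_to_probability(-1.0, [1.5, -0.5])
print(probability)
```

In this toy case the contributions cancel the base value, so the margin is 0 and the corresponding probability is 0.5, even though the force plot itself would show an output value of 0.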
1
vote
0 answers

Predictive model performs exceedingly well during training and testing, but predicts zero when predicting the very same data

I've created a binary classification model which predicts whether an article is part of the positive or negative class. I am using TF-IDF fed into an XGBoost classifier alongside another feature. I get an AUC score very close to 1 when both…
NickS
  • 31
  • 3