CatBoost is an open-source gradient boosting on decision trees library with categorical features support out of the box for Python, R
Questions tagged [catboost]
344 questions
2
votes
2 answers
Is there a dark mode for Catboost fitting plot?
There seems to be no easily accessible dark mode for the CatBoost fitting plot.
The documentation does not seem to contain anything on the subject.
I am running my Jupyter Notebook in VS Code, and I am using these lines to get dark mode with seaborn…

Kkameleon
2
votes
1 answer
[Catboost][ClearML] Error: if loss-function is Logloss, then class weights should be given for 0 and 1 classes
Having recently started using ClearML to manage MLOps, I am facing the following problem:
When I run a script from my computer that trains a CatBoost model on a binary classification problem with different class weights, it works perfectly, logs…

Plinio Zanini
2
votes
1 answer
CatBoost eval_set not working inside scikit-learn pipeline
I am trying to pass the X_valid dataset as the eval_set parameter of the fit function from the CatBoost library (this is the link to the documentation), but I am getting the following error:
ValueError: Pipeline.fit does not accept the cat_features…

Lucas Dresl
2
votes
2 answers
Obtaining summary shap plot for catboost model with tidymodels in R
I am trying to build a catboost model within the tidymodels framework. A minimal reproducible example is given below. I am able to use the DALEX and modelStudio packages to get model explanations, but I want to create VIP plots like this and summary…

Rizwan S A
2
votes
1 answer
Get Confidence probability Scores for each Predicted Result in Catboost Classifier
I have built a machine learning model using the CatBoost classifier to predict the categoryname of my result, as shown in screenshot1 below. However, if I get an unknown input, or any input that the model has not been trained on, then I need to…

SMR
2
votes
1 answer
CatBoost: Are we overfitting?
Our team is currently using CatBoost to develop credit scoring models, and our current process is to...
Sort the data chronologically for out-of-time sampling, and split it into train, valid, and test sets
Perform feature engineering
Perform…
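The chronological split described in the first step could look roughly like the sketch below. It is an illustration only, not the poster's code: the function name out_of_time_split, the time_key argument, and the split fractions are all assumptions made for the example.

```python
def out_of_time_split(rows, time_key, valid_frac=0.15, test_frac=0.15):
    """Sort rows chronologically and cut them into train, valid, and test sets.

    Hypothetical helper: `rows` is a list of dicts and `time_key` names the
    field holding a sortable timestamp. The most recent rows go to the test
    set, the ones just before them to the validation set.
    """
    ordered = sorted(rows, key=lambda row: row[time_key])
    n = len(ordered)
    n_test = int(n * test_frac)
    n_valid = int(n * valid_frac)
    n_train = n - n_valid - n_test

    train = ordered[:n_train]
    valid = ordered[n_train:n_train + n_valid]
    test = ordered[n_train + n_valid:]
    return train, valid, test


# Toy data with integer "dates" in shuffled order.
rows = [{"date": d} for d in [5, 1, 9, 3, 7, 0, 8, 2, 6, 4]]
train, valid, test = out_of_time_split(rows, "date", valid_frac=0.2, test_frac=0.2)
```

Because the data is sorted before it is cut, every validation row is later than every training row, and every test row is later than every validation row, which is what makes the evaluation out-of-time.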

swritchie
2
votes
1 answer
Error related to labels when tuning catboost in tidymodels
Here is the model:
cb_spec <- boost_tree(
  mode = "classification",
  trees = 1000,
  tree_depth = tune(),
  min_n = tune(),
  mtry = tune(),
  learn_rate = tune()
) %>%
  set_engine("catboost", loss_function = "Logloss", task_type = "GPU")
Here…

tedescr
2
votes
1 answer
How do I utilize the MAP eval metric in Catboost to calculate Mean Average Precision?
I have been using a custom metric for Precision-Recall AUC in CatBoost. However, it iterates slowly and is incompatible with GPU. I see CatBoost has a metric "MAP" for Mean Average Precision, which is what I need for my (binary) classification…
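For reference, here is a minimal pure-Python sketch of the textbook average precision for a single binary ranking, which is the quantity that mean average precision averages over queries or groups. It is not CatBoost's implementation of "MAP", which may differ (for example in how it groups samples or handles tied scores); the function name average_precision is an assumption made for the example.

```python
def average_precision(y_true, y_score):
    """Average precision for binary labels (1 = positive, 0 = negative).

    Examples are ranked by descending score; the result is the mean of
    precision@k taken at the rank k of each positive example.
    """
    # sorted() is stable, so tied scores keep their input order.
    order = sorted(range(len(y_score)), key=lambda i: -y_score[i])

    hits = 0
    precision_sum = 0.0
    for rank, idx in enumerate(order, start=1):
        if y_true[idx] == 1:
            hits += 1
            precision_sum += hits / rank

    # With no positives the metric is undefined; return 0.0 by convention here.
    return precision_sum / hits if hits else 0.0


# Positives sit at ranks 1 and 3, so AP = (1/1 + 2/3) / 2 = 5/6.
ap = average_precision([1, 0, 1, 0], [0.9, 0.8, 0.7, 0.1])
```

A plain-Python loop like this is only useful for checking values by hand; it would be just as slow as any other custom Python metric if called on every boosting iteration.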

Aaron England
2
votes
1 answer
Offline installation of R catboost package on ubuntu
I am working on Azure Databricks and its compute server is Ubuntu 18.04. I want to install the catboost R package, but without internet access for security reasons. I downloaded the GitHub repo of catboost on my MacBook, which has internet access, and…

Amir
2
votes
1 answer
How to specify more than one eval_metric for a CatBoostRegressor?
I want to specify more than one evaluation metric for my CatBoostRegressor:
model=catboost.CatBoostRegressor(eval_metric=['RMSE', 'MAE', 'R2'])
so that I can get the results very simply with the .get_best_score() method, but it does not accept the…

Blate Raven
2
votes
0 answers
How do I pass the values to Catboost?
I'm trying to work with catboost and I've got a problem that I'm really stuck with right now. I have a dataframe with 28 columns, 2 of them are categorical. When the data is numerical there are some even and some fractional numbers, also some 0.00…

runny quasar
2
votes
1 answer
Loading data into Catboost Pool object
I'm training a CatBoost model and using a Pool object as follows:
pool = Pool(data=x_train, label=y_train, cat_features=cat_cols)
eval_set = Pool(data=x_validation, label=y_validation['Label'], cat_features=cat_cols)
model.fit(pool,…

nofar mishraki
2
votes
2 answers
CatBoost on GPU provides much worse performance than on CPU
We are testing CatBoost on both CPU and GPU.
While it runs much faster on GPU than on CPU, the results we are getting are much worse, even though we are using the same data.
I am talking about results that are around 50% worse.
How is this possible?
We are using the following…

Amit Raz
2
votes
0 answers
Putting weights on values of a categorical feature
Suppose we have the following dataset
df = pd.DataFrame({'feature 1': ['a', 'b', 'c', 'd', 'e'],
                   'feature 2': [1, 2, 3, 4, 5], 'y': [1, 0, 0, 1, 1]})
As we can see, feature 1 is categorical. In the usual tree-based models, such as XGBoost or CatBoost, the values under…

Wiliam
2
votes
1 answer
Why are my CatBoost fit metrics different from the sklearn evaluation metrics?
I'm still not sure whether this should be a question for this forum or for Cross Validated, but I'll try this one, since it's more about the output of the code than the technique per se. Here's the thing: I'm running a CatBoost classifier, just like this:
#…

dekio