Questions tagged [catboost]

CatBoost is an open-source gradient boosting on decision trees library with categorical features support out of the box for Python, R

344 questions
2
votes
2 answers

Is there a dark mode for Catboost fitting plot?

It seems there is no easily accessible dark mode for the CatBoost fitting plot. The documentation does not seem to contain anything on the subject. I am running my Jupyter Notebook in VS Code and I am using these lines to get dark mode with seaborn…
Kkameleon
  • 163
  • 2
  • 14
2
votes
1 answer

[Catboost][ClearML] Error: if loss-function is Logloss, then class weights should be given for 0 and 1 classes

Having recently started using ClearML to manage MLOps, I am facing the following problem: when I run a script from my computer that trains a CatBoost model on a binary classification problem using different class weights, it works perfectly, logs…
2
votes
1 answer

CatBoost eval_set not working inside scikit-learn pipeline

I am trying to pass the X_valid dataset to the eval_set parameter of the fit function from the CatBoost library (this is the link to the documentation), but I am getting the following error: ValueError: Pipeline.fit does not accept the cat_features…
Lucas Dresl
  • 1,150
  • 1
  • 10
  • 19
2
votes
2 answers

Obtaining summary shap plot for catboost model with tidymodels in R

I am trying to build a catboost model within the tidymodels framework. A minimal reproducible example is given below. I am able to use the DALEX and modelStudio packages to get model explanations, but I want to create VIP plots like this and summary…
Rizwan S A
  • 77
  • 5
2
votes
1 answer

Get Confidence probability Scores for each Predicted Result in Catboost Classifier

I have built a machine learning model using the CatBoost classifier to predict the categoryname of my result, as per screenshot1 below. However, if I get an unknown input, or any input the model has not been trained on, then I need to…
SMR
  • 401
  • 4
  • 15
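
A minimal sketch of one way to approach the question above, under stated assumptions: take per-class probabilities (for example, the rows returned by `CatBoostClassifier.predict_proba`) and fall back to an "unknown" label when the top probability is below a cutoff. The helper name `label_with_confidence`, the 0.6 threshold, and the class names are hypothetical, and a high probability does not by itself prove that the input resembles the training data.

```python
from typing import List, Sequence, Tuple


def label_with_confidence(
    probabilities: Sequence[Sequence[float]],
    class_names: Sequence[str],
    threshold: float = 0.6,  # hypothetical cutoff; tune it on validation data
    unknown_label: str = "unknown",
) -> List[Tuple[str, float]]:
    """Return (label, confidence) per row; low-confidence rows get unknown_label."""
    results = []
    for row in probabilities:
        # Index of the most probable class in this row.
        best = max(range(len(row)), key=lambda i: row[i])
        confidence = float(row[best])
        label = class_names[best] if confidence >= threshold else unknown_label
        results.append((label, confidence))
    return results


# Made-up probabilities standing in for model.predict_proba(X).
example = [[0.9, 0.05, 0.05], [0.4, 0.35, 0.25]]
print(label_with_confidence(example, ["books", "music", "games"]))
```

The second row is labelled "unknown" because its best probability (0.4) is below the assumed cutoff.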
2
votes
1 answer

CatBoost: Are we overfitting?

Our team is currently using CatBoost to develop credit scoring models, and our current process is to: sort the data chronologically for out-of-time sampling and split it into train, valid, and test sets; perform feature engineering; perform…
2
votes
1 answer

Error related to labels when tuning catboost in tidymodels

Here is the model: cb_spec <- boost_tree( mode = "classification", trees = 1000, tree_depth = tune(), min_n = tune(), mtry = tune(), learn_rate = tune() ) %>% set_engine("catboost", loss_function = "Logloss", task_type = "GPU") Here…
tedescr
  • 53
  • 6
2
votes
1 answer

How do I utilize the MAP eval metric in Catboost to calculate Mean Average Precision?

I have been using a custom metric for Precision-Recall AUC in Catboost. However, it iterates slowly and is incompatible with GPU. I see Catboost has a metric "MAP" for Mean Average Precision, which is what I need for my (binary) classification…
Aaron England
  • 1,223
  • 1
  • 14
  • 26
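
For reference, the textbook definition of mean average precision can be sketched in plain Python. This is an illustration only, not CatBoost's implementation: the function names are hypothetical, and CatBoost's built-in `MAP` metric may differ in details such as how groups and top-k cutoffs are handled.

```python
from typing import Iterable, Sequence, Tuple


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Average precision for one ranked list of binary labels (1 = positive)."""
    # Rank items from highest to lowest score.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, idx in enumerate(order, start=1):
        if labels[idx] == 1:
            hits += 1
            precision_sum += hits / rank  # precision at this positive's rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(
    groups: Iterable[Tuple[Sequence[int], Sequence[float]]]
) -> float:
    """Mean of average_precision over (labels, scores) pairs, one per group."""
    values = [average_precision(labels, scores) for labels, scores in groups]
    return sum(values) / len(values) if values else 0.0


# Made-up labels and scores for a single group.
print(average_precision([1, 0, 1, 0], [0.9, 0.8, 0.7, 0.1]))
```

Comparing a few values from this sketch against the library's reported metric on a small dataset is one way to check whether the two definitions agree for a given setup.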
2
votes
1 answer

Offline installation of R catboost package on ubuntu

I am working on Azure Databricks and its compute server is Ubuntu 18.04. I want to install the catboost R package, but without internet access for security reasons. I downloaded the GitHub repo of catboost on my MacBook, which has internet access, and…
Amir
  • 685
  • 3
  • 13
  • 36
2
votes
1 answer

How to specify more than one eval_metric for a CatBoostRegressor?

I want to specify more than one evaluation metric for my CatBoostRegressor: model=catboost.CatBoostRegressor(eval_metric=['RMSE', 'MAE', 'R2']) so that I can get the results simply with the .get_best_score() method, but it does not accept the…
2
votes
0 answers

How do I pass the values to Catboost?

I'm trying to work with catboost and I've got a problem that I'm really stuck with right now. I have a dataframe with 28 columns, 2 of them are categorical. When the data is numerical there are some even and some fractional numbers, also some 0.00…
2
votes
1 answer

Loading data into Catboost Pool object

I'm training a Catboost model and using a Pool object as follows: pool = Pool(data=x_train, label=y_train, cat_features=cat_cols) eval_set = Pool(data=x_validation, label=y_validation['Label'], cat_features=cat_cols) model.fit(pool,…
nofar mishraki
  • 526
  • 1
  • 4
  • 15
2
votes
2 answers

CatBoost on GPU provides much worse performance than on CPU

We are testing CatBoost on both CPU and GPU. While it runs much faster on GPU than on CPU, the results are much worse even though we are using the same data: around 50% worse. How is this possible? We are using the following…
Amit Raz
  • 5,370
  • 8
  • 36
  • 63
2
votes
0 answers

Putting weights on values of a categorical feature

Suppose we have the following dataset: df = pd.DataFrame({'feature 1':['a','b','c','d','e'], 'feature 2':[1,2,3,4,5],'y':[1,0,0,1,1]}). As we can see, feature 1 is categorical. In usual tree-based models such as XGBoost or CatBoost, the values under…
2
votes
1 answer

Why are my CatBoost fit metrics different from the sklearn evaluation metrics?

I'm still not sure this should be a question for this forum or for Cross-Validated, but I'll try this one, since it's more about the output of the code than the technique per se. Here's the thing, I'm running a CatBoost Classifier, just like this: #…
dekio
  • 810
  • 3
  • 16
  • 33