
I'm trying to decide which of the following I will use in practice for regression tasks: XGBoost, LightGBM, or CatBoost (Python 3).

So, what is the general idea behind each of them? Why should I choose one over another?

I'm not interested in very slight differences in accuracy, like 0.781 vs 0.782. The results should be tenable, and the tool should be robust and convenient to use. A workhorse.

awqwewrqaq

2 Answers


As I understand these methods, they differ mainly in how they are implemented; all of them are implementations of gradient boosting (GBM).

So you should just try some hyperparameter tuning on each of them. It's also a good idea to read this comparison: catboost-vs-light-gbm-vs-xgboost

stonechat

You cannot determine a priori which tree algorithm (or any algorithm) will automatically be the best. This follows from the No Free Lunch theorem: https://en.wikipedia.org/wiki/No_free_lunch_theorem

It's best to try them all out. You should also throw in Random Forest (RF) as another one to try.

I will say that CatBoost (CB, http://CatBoost.ai) does have one advantage over the others: if you have categorical variables, CB will most likely beat the others because it can handle categorical variables directly, without one-hot encoding.
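An illustration of the preprocessing step CatBoost lets you skip. The `city` column is a made-up example; with XGBoost or (classically) LightGBM you would typically one-hot encode it first, while CatBoost accepts the raw string column through its `cat_features` argument (shown as a comment so the sketch runs with pandas alone).

```python
import pandas as pd

df = pd.DataFrame({
    "city": ["paris", "tokyo", "paris", "lima"],  # categorical feature
    "sqm": [40, 55, 30, 70],                      # numeric feature
})

# One-hot encoding: one new 0/1 column per distinct category.
encoded = pd.get_dummies(df, columns=["city"])
print(encoded.columns.tolist())

# With many distinct categories this blows up the feature count, which is
# exactly what CatBoost's native handling avoids:
#
#   from catboost import CatBoostRegressor
#   model = CatBoostRegressor(verbose=0)
#   model.fit(df, target, cat_features=["city"])  # no encoding step needed
```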

You might try H2O's (http://H2O.ai) grid search, which supports several algorithms (RF, XGBoost, GBM, linear regression) with hyperparameter tuning, to see which one works best. You can run it overnight. (CB is not included in H2O's grid search.)

Clem Wang