
I'm currently using CatBoostRegressor(iterations=500, random_seed=123, cat_features=['month_number', 'day_of_week', 'year']) to develop a 1-year predictive model at a daily level. The predictor variables are a date-derived time feature, categorical features A1 - A4 (for example, A1 = Group: 1/2/3/4, A2 = Class: upper/middle/lower), and numerical features A5 - A7 (for example, A5 = Price: 100/125.6/132.7, A6 = Age: 25/30/31/57). The response variable is continuous, such as revenue or rental price.
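Concretely, my setup looks roughly like the sketch below (the A1-A7 column names and the categorical list are illustrative placeholders for my real columns):

```python
from catboost import CatBoostRegressor, Pool

# Declare the calendar columns and A1-A4 as categorical so CatBoost
# applies its own categorical encoding instead of treating them as numbers.
# Column names here are placeholders, not a real schema.
cat_cols = ['month_number', 'day_of_week', 'year', 'A1', 'A2', 'A3', 'A4']

model = CatBoostRegressor(
    iterations=500,
    random_seed=123,
    loss_function='MAE',
    eval_metric='MAE',
    cat_features=cat_cols,
)

# Training then looks like:
# train_pool = Pool(X_train, y_train, cat_features=cat_cols)
# model.fit(train_pool)
```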

My current model, CatBoostRegressor(iterations=500, random_seed=123) with the two parameters loss_function = 'MAE' and eval_metric = 'MAE', mostly underfits the actual data, although the final MAPE was around 10%-20% (so some overfitting offsets the underfitting). However, I am unsure how the MAE function described in the CatBoost documentation defines the weight w_i for each prediction error, since the MAE is the weighted average of all the prediction errors. In particular, what do they mean by "Use object/group weights to calculate metrics if the specified value is true"? I am unable to find any examples of how these weights are set, because if the weights are all equal, would the model still under-fit?
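My current understanding of the documented formula, sketched with made-up numbers, is that w_i is a per-object weight that defaults to 1 unless weights are passed explicitly (e.g. via the `weight` argument of `Pool` or `sample_weight` in `fit`), so with no weights the metric reduces to the plain mean absolute error (which would mean equal weights are not, by themselves, the cause of the underfitting):

```python
import numpy as np

def weighted_mae(y_true, y_pred, w=None):
    """MAE as a weighted average of absolute errors:
    sum(w_i * |y_i - yhat_i|) / sum(w_i)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    if w is None:
        # Default: every object weighted equally (w_i = 1),
        # so this reduces to the plain mean of absolute errors.
        w = np.ones_like(y_true)
    w = np.asarray(w, dtype=float)
    return np.sum(w * np.abs(y_true - y_pred)) / np.sum(w)

y_true = [100.0, 120.0, 130.0]
y_pred = [110.0, 120.0, 100.0]

print(weighted_mae(y_true, y_pred))               # (10 + 0 + 30) / 3 ≈ 13.33
print(weighted_mae(y_true, y_pred, w=[1, 1, 4]))  # (10 + 0 + 120) / 6 ≈ 21.67
```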

What I have tried so far to improve the under-fit issue: I already use almost the entire training dataset to train CatBoostRegressor(), so increasing the training size is not feasible for me (the model needs to train fast enough in a production environment). The most promising remaining option was to increase the number of iterations from 500 to 1000; I tried this, but it did not improve the MAPE.
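What I am considering trying next is a configuration along these lines, i.e. lowering the learning rate while raising the iteration cap and letting early stopping on a held-out set pick the tree count. The specific values below are guesses on my part, not recommendations from the CatBoost docs:

```python
from catboost import CatBoostRegressor

model = CatBoostRegressor(
    iterations=2000,        # give the ensemble room to grow...
    learning_rate=0.03,     # ...while taking smaller steps per tree
    depth=8,                # deeper trees may help against underfitting
    loss_function='MAE',
    eval_metric='MAE',
    random_seed=123,
)

# Fitting with a validation set so the effective number of trees
# is chosen by the overfitting detector rather than by hand:
# model.fit(train_pool, eval_set=valid_pool,
#           early_stopping_rounds=100, use_best_model=True)
```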

Question. Is it true that the default parameters of CatBoostRegressor() change dynamically based on the dataset? What are the recommended values for tuning the learning rate and the number of trees? Should I also customize loss_function and eval_metric to cope with the underfitting issue? The tutorial provided for writing a custom loss function is so complicated that I couldn't follow it, so any tangible examples would be appreciated.
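For context, my current reading of the custom-objective interface from that tutorial is the sketch below: an object with a `calc_ders_range` method returning one (first derivative, second derivative) pair per object, following the tutorial's sign convention. I am not sure this is right, in particular the second derivative, where I use a small constant as a placeholder since |x| has zero curvature:

```python
class MaeObjective:
    """Sketch of an MAE-like custom objective, intended to be passed as
    CatBoostRegressor(loss_function=MaeObjective()). Illustrative only."""

    def calc_ders_range(self, approxes, targets, weights):
        # approxes: raw model predictions; targets: true values;
        # weights: per-object weights or None (treated as all 1s).
        result = []
        for i in range(len(targets)):
            w = 1.0 if weights is None else weights[i]
            diff = targets[i] - approxes[i]
            # First derivative of -|target - approx| w.r.t. approx
            # is the sign of the residual.
            der1 = 1.0 if diff > 0 else (-1.0 if diff < 0 else 0.0)
            # |x| has no curvature, so a small negative constant stands
            # in for the second derivative (my assumption, not from docs).
            der2 = -1e-6
            result.append((w * der1, w * der2))
        return result

# usage (assumed, mirroring the tutorial's pattern):
# model = CatBoostRegressor(loss_function=MaeObjective(), eval_metric='MAE')
```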

user177196
