I'm looking for an explanation for the following issue. I'm fitting an XGBoost model to a dataset of 50k rows and 26 features.
I'm running a grid search over varying max_depth and n_estimators values, and the model performs much better with deeper trees (with a depth of 14 I'm getting roughly .87 accuracy and .83 precision; when I reduce the depth to 5, performance drops to .85 accuracy and .81 precision). On the validation set, the performance is the same for both models (.81 accuracy, .78 precision).
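To make the setup concrete, here's a simplified sketch of this kind of grid search (using scikit-learn's GridSearchCV with the XGBoost sklearn wrapper; the synthetic data, parameter grid, and CV settings are placeholders, not my exact configuration):

```python
# Simplified sketch of a grid search over max_depth and n_estimators.
# The synthetic data stands in for the real 50k x 26 dataset.
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from xgboost import XGBClassifier

# Placeholder data with the same shape as the real dataset
X, y = make_classification(n_samples=50_000, n_features=26, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, random_state=0)

param_grid = {
    "max_depth": [5, 14],             # the two depths compared above
    "n_estimators": [100, 300, 500],  # illustrative values
}

grid = GridSearchCV(
    XGBClassifier(),
    param_grid,
    scoring="accuracy",
    cv=5,
)
grid.fit(X_train, y_train)
print(grid.best_params_)
print(grid.cv_results_["mean_test_score"])
```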
So on the surface it seems like the deeper model is performing the same or better, but when I plot learning curves for the two models, the deeper model looks like it's overfitting. The image below shows the learning curves for the two models, with the deeper trees at the top and the shallower trees at the bottom.
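For reference, learning curves of this kind can be produced along these lines (a sketch using scikit-learn's learning_curve on the same placeholder data; repeating the call with max_depth=5 gives the shallower model's curves):

```python
# Sketch: training vs. cross-validation accuracy as the training set grows,
# for the deeper (max_depth=14) model. Placeholder data, illustrative params.
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.model_selection import learning_curve
from xgboost import XGBClassifier

X, y = make_classification(n_samples=50_000, n_features=26, random_state=0)

train_sizes, train_scores, val_scores = learning_curve(
    XGBClassifier(max_depth=14, n_estimators=300),  # deeper model
    X, y,
    train_sizes=np.linspace(0.1, 1.0, 5),
    cv=5,
    scoring="accuracy",
)

plt.plot(train_sizes, train_scores.mean(axis=1), label="training accuracy")
plt.plot(train_sizes, val_scores.mean(axis=1), label="cross-validation accuracy")
plt.xlabel("training set size")
plt.ylabel("accuracy")
plt.legend()
plt.show()
```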
How can this be explained?