My understanding of gradient boosting is this...
We can make the model much more complex by building lots of decision trees sequentially, where each new tree builds on the ones before it. The goal of each new tree is to correct the errors where the previous trees are most wrong. If we had 3,000 decision trees, the errors would be reduced 3,000 times in a row, so by the end the overall error should be very small.
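To make it concrete, here is a minimal sketch of how I picture it working for squared-error loss (the function names, the learning_rate value, and the tree depth are just illustrative; I'm using sklearn's DecisionTreeRegressor as the weak learner):

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def fit_boosted(X, y, n_trees=3000, learning_rate=0.1, max_depth=3):
    # Start from a constant prediction (the mean minimizes squared error).
    base = y.mean()
    pred = np.full(len(y), base)
    trees = []
    for _ in range(n_trees):
        # Residuals mark where the ensemble so far is still wrong.
        residuals = y - pred
        tree = DecisionTreeRegressor(max_depth=max_depth)
        tree.fit(X, residuals)  # each new tree targets the remaining errors
        # Add a damped version of the new tree's correction.
        pred += learning_rate * tree.predict(X)
        trees.append(tree)
    return base, trees

def predict_boosted(base, trees, X, learning_rate=0.1):
    # Prediction is the base value plus the sum of all tree corrections.
    pred = np.full(X.shape[0], base)
    for tree in trees:
        pred += learning_rate * tree.predict(X)
    return pred
```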
Are there any faults in my understanding?