I would like to study the optimal bias/variance tradeoff for model tuning. I'm using caret for R, which lets me plot a performance metric (AUC, accuracy, ...) against the hyperparameters of the model (mtry, lambda, etc.) and automatically chooses the maximum. This typically returns a good model, but if I want to dig further and choose a different bias/variance tradeoff I need a learning curve, not a performance curve.
For the sake of simplicity, let's say my model is a random forest, which has just one hyperparameter, 'mtry'.
I would like to plot the learning curves of both the training and test sets. Something like this:

[learning-curve plot omitted; the red curve is the test set]
On the y axis I put an error metric (the number of misclassified examples, or something like that); on the x axis either 'mtry' or, alternatively, the training set size.
Questions:
Does caret have functionality to iteratively train models on training set folds of different sizes? If I have to code it by hand, how can I do that?
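Here is roughly what I have in mind if I have to do it by hand (a minimal sketch, assuming a data frame `my_data` with a factor outcome `Class`, both placeholders for my actual data): fix mtry, refit on stratified subsamples of increasing size, and record the training and test error of each fit.

    library(caret)

    set.seed(42)
    # Hypothetical data: my_data with a factor column Class
    idx      <- createDataPartition(my_data$Class, p = 0.8, list = FALSE)
    training <- my_data[idx, ]
    testing  <- my_data[-idx, ]

    fractions <- seq(0.2, 1.0, by = 0.2)   # fractions of the training set to use
    curve_df  <- data.frame()

    for (p in fractions) {
      # Stratified subsample of the training set (use everything when p == 1)
      sub <- if (p < 1) {
        training[createDataPartition(training$Class, p = p, list = FALSE), ]
      } else {
        training
      }

      # Fix mtry so the curve varies only with the training set size
      fit <- train(Class ~ ., data = sub,
                   method    = "rf",
                   trControl = trainControl(method = "cv", number = 5),
                   tuneGrid  = data.frame(mtry = 2))

      curve_df <- rbind(curve_df, data.frame(
        n_train   = nrow(sub),
        train_err = mean(predict(fit, sub)     != sub$Class),
        test_err  = mean(predict(fit, testing) != testing$Class)))
    }

    # Learning curves: training vs. test error against training set size
    plot(curve_df$n_train, curve_df$test_err, type = "b", col = "red",
         ylim = range(curve_df[, c("train_err", "test_err")]),
         xlab = "training set size", ylab = "misclassification error")
    lines(curve_df$n_train, curve_df$train_err, type = "b", col = "blue")
    legend("topright", c("test", "training"), col = c("red", "blue"), lty = 1)

Is there a built-in way to do this, or is a loop like the above the intended approach?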
If I want to put the hyperparameter on the x axis, I need all the models trained by caret::train, not just the final model (the one with the best performance chosen after CV). Are these "discarded" models still available after train?
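For example, I would like to be able to do something like the sketch below (reusing the hypothetical `training` data frame from above). As far as I can tell, the resampled performance for every candidate mtry is kept in the `results` element of the train object, and with `savePredictions` in `trainControl` the hold-out predictions are kept as well, but I'm not sure whether the fitted model objects themselves survive beyond `finalModel`.

    # A sketch, reusing the hypothetical `training` data frame from above
    ctrl <- trainControl(method = "cv", number = 5,
                         savePredictions = "all")   # keep hold-out predictions for every mtry

    fit <- train(Class ~ ., data = training,
                 method    = "rf",
                 trControl = ctrl,
                 tuneGrid  = data.frame(mtry = c(2, 4, 6, 8)))

    fit$results   # resampled performance (Accuracy, Kappa, ...) for every mtry, not just the best
    fit$pred      # hold-out predictions per resample and per mtry (because savePredictions = "all")

    # Cross-validated error against mtry
    plot(fit$results$mtry, 1 - fit$results$Accuracy, type = "b",
         xlab = "mtry", ylab = "CV error")

If the non-winning models really are discarded, is refitting for each mtry the only way to get their training-set error?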