
After some research I found two functions in MATLAB to do the task: cvpartition and crossvalind.

Now, I've used cvpartition before to create n-fold cross-validation subsets, along with the Dataset/Nominal classes from the Statistics Toolbox. So I'm just wondering: what are the differences between the two functions, and what are the pros/cons of each?

Amro

4 Answers


Expanding on @Mr Fooz's answer:

They look pretty similar based on the official docs for cvpartition and crossvalind, but crossvalind looks slightly more flexible (it allows leave-M-out for arbitrary M, whereas cvpartition only allows leave-one-out).

... isn't it true that you can always simulate leave-M-out using k-fold cross-validation with an appropriate value of k (split the data into k folds, test on one fold, train on all the others, repeat for every fold, and average the results)? After all, leave-one-out is just the special case of k-fold where k equals the number of observations.
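To make that concrete, here is a minimal sketch of the simulation (assuming the Statistics Toolbox for cvpartition and the Bioinformatics Toolbox for crossvalind; N, M, and the train/evaluate step are hypothetical placeholders). One practical difference worth noting: the k-fold partition guarantees disjoint test sets that cover the data exactly once, while repeated crossvalind('LeaveMOut', ...) draws are random and not guaranteed to be disjoint.

N = 100;                            % number of observations
M = 5;                              % observations to leave out per test set
k = N / M;                          % choose k so each fold holds M observations

c = cvpartition(N, 'KFold', k);     % k-fold partition via cvpartition
for i = 1:c.NumTestSets
    trainIdx = training(c, i);      % logical index of the training set
    testIdx  = test(c, i);          % logical index of the M held-out observations
    % ... train on trainIdx, evaluate on testIdx, accumulate the error ...
end

% one random leave-M-out draw via crossvalind, for comparison:
[trainIdx, testIdx] = crossvalind('LeaveMOut', N, M);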


Amro, this is not directly an answer to your cvpartition vs. crossvalind question, but there is a contribution on the MathWorks File Exchange called MulticlassGentleAdaboosting, by user Sebastian Paris, that includes a nice set of functions for enumerating the array indices that make up the training, testing, and validation sets under the following sampling and cross-validation strategies:

  • Hold-out
  • Bootstrap
  • K-fold cross-validation
  • Leave-one-out
  • Stratified cross-validation
  • Balanced stratified cross-validation
  • Stratified hold-out
  • Stratified bootstrap

For details, see the demo files included in the package, particularly the functions sampling.m and sampling_set.m.
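Incidentally, for the stratified strategies in that list, the built-in cvpartition gets you part of the way: if you pass it the class labels rather than an observation count, the folds it produces are stratified. A minimal sketch (the labels y are a hypothetical imbalanced example):

y = [ones(30,1); 2*ones(70,1)];     % imbalanced two-class labels
c = cvpartition(y, 'KFold', 5);     % folds preserve the ~30/70 class ratio
for i = 1:c.NumTestSets
    tabulate(y(test(c, i)))         % check the class proportions in each fold
end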

Amelio Vazquez-Reina

They look pretty similar based on the official docs for cvpartition and crossvalind, but crossvalind looks slightly more flexible (it allows leave-M-out for arbitrary M, whereas cvpartition only allows leave-one-out).
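For reference, the two calls look something like this (N and M are hypothetical placeholders):

N = 100;  M = 3;
c = cvpartition(N, 'LeaveOut');                        % leave-one-out only
[trainIdx, testIdx] = crossvalind('LeaveMOut', N, M);  % leave M out, for any M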

Mr Fooz

I know your question is not directly referring to the Neural Network Toolbox, but perhaps someone else might find this useful. To get your ANN input data separated into test/validation/train sets, set the net.divideFcn property:

net.divideFcn = 'divideind';        % divide the data by explicit indices

% the three index sets should be disjoint:
net.divideParam.trainInd = 1:80;    % the first 80 inputs are for training
net.divideParam.valInd   = 81:94;   % the next 14 inputs are for validation
net.divideParam.testInd  = 95:100;  % the last 6 inputs are for testing the network
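If you would rather let the toolbox split the data randomly instead of by fixed indices, 'dividerand' with ratios works too (a sketch; the 70/15/15 split is just an example):

net.divideFcn = 'dividerand';        % random division instead of fixed indices
net.divideParam.trainRatio = 0.70;   % 70% of the inputs for training
net.divideParam.valRatio   = 0.15;   % 15% for validation
net.divideParam.testRatio  = 0.15;   % 15% for testing the network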
Aman