
Suppose that we have this code in MATLAB R2015b:

SVMModel = fitcsvm(INPUT, output, 'KernelFunction', 'RBF', 'BoxConstraint', 1);  % SVM trained on the whole data set
CVSVMModel = crossval(SVMModel);  % 10-fold cross-validated model (default)
z = kfoldLoss(CVSVMModel)         % average misclassification rate over the 10 folds
  • In the first line, fitcsvm trains a model on the whole data set. What is the purpose of setting 'CrossVal' to 'on' in fitcsvm (by default this option performs 10-fold cross-validation)? Do crossval and kfoldLoss use the same method as above? If yes, why does the MATLAB documentation mention only this method and not the 'CrossVal' option for cross-validation? And if these procedures are the same, how can we get the error rate using the first procedure?

  • When we want to make predictions (this is a prediction model), do we need to use the model trained with the whole data (here, the SVMModel object)? In other words, crossval and kfoldLoss are used only for estimating the error, and the 10 models trained during cross-validation are not used for prediction. Is this true? Is using the whole data for the final model also valid for neural networks?

Eghbal

1 Answer


Regarding the first question: setting "CrossVal" to "on" and passing the trained model to the crossval() function achieve the same thing. You can use one or the other; it's up to you.

kfoldLoss() is a function per se; it is not included in the "CrossVal" flag. It takes a cross-validated model as input, no matter whether you cross-validated that model using the "CrossVal" flag in fitcsvm() or using the dedicated crossval() function. You must use this function if you want to evaluate the error rate.
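A minimal sketch of the two routes, reusing the INPUT and output variables from the question (the CVModel1/CVModel2 and loss1/loss2 names are just illustrative):

% Route 1: cross-validate directly inside fitcsvm via the 'CrossVal' flag
CVModel1 = fitcsvm(INPUT, output, 'KernelFunction', 'RBF', 'BoxConstraint', 1, 'CrossVal', 'on');  % 10-fold by default

% Route 2: train on the whole data first, then cross-validate the trained model
SVMModel = fitcsvm(INPUT, output, 'KernelFunction', 'RBF', 'BoxConstraint', 1);
CVModel2 = crossval(SVMModel);  % also 10-fold by default

% kfoldLoss() accepts the cross-validated model from either route
loss1 = kfoldLoss(CVModel1);  % average misclassification rate over the 10 folds
loss2 = kfoldLoss(CVModel2);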

Regarding the second question, the short answer is yes: you have to use the trained Support Vector Machine model as returned by fitcsvm(). The cross-validation procedure aims at validating your model, so you get an idea of its performance (and 10-fold cross-validation is just one of many available methods), but it does not perform any prediction. For that, you have to use the predict() function. I assume you have a training set and a test set (or validation set) with their respective labels. You train the SVM model on the training set and use the validation set for the prediction phase. The main output of predict() is the vector of labels the model has predicted, and you can compare these predicted labels with the true labels of your validation set to obtain the validation error rate.
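A short sketch of that prediction phase, assuming hypothetical hold-out data Xval with numeric labels Yval (neither appears in the original question):

[predictedLabels, scores] = predict(SVMModel, Xval);  % SVMModel is the whole-data-trained model
validationError = mean(predictedLabels ~= Yval);      % fraction of mismatched labels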

I suggest you avoid the "CrossVal" flag; this way you keep the situation under control, since you'll have:

  1. the trained model, output of fitcsvm()
  2. the cross-validated model, output of crossval(), whose performance you can evaluate with kfoldLoss()
  3. the predicted labels, using predict() with the trained model from step #1
AlessioX
  • Thank you for the answer. About the second question: in k-fold cross-validation we train 10 models on 10 different sub-samples (10 different models), while the final model is trained on the whole data. Is the error rate returned by cross-validation reliable given this behaviour? – Eghbal Feb 13 '16 at 20:30
  • In 10-fold cross-validation you split your entire training set into 10 subsets, and the cross-validation process is repeated 10 times, with each of the 10 subsets used exactly once as the validation set. You get 10 error rates, sure, which are then combined (e.g. averaged) to return a single value (see the sketch after these comments for how to inspect the per-fold values). – AlessioX Feb 13 '16 at 20:33
  • That's true, but the final model is trained on the whole data, so we end up with a differently tuned SVM compared to the 10 models tuned during cross-validation to compute the error. This problem is even more obvious when using a neural network. – Eghbal Feb 13 '16 at 20:35
  • You do not use the cross-validated model for prediction; you use the "whole-data-trained" model for prediction. Keep in mind that prediction and validation are two different phases in machine learning. – AlessioX Feb 13 '16 at 20:36
  • Suppose that we are fitting a neural network (considering the random nature of the initial weights and biases). I'm saying that these sub-trained models don't reflect the real error rate, because they are completely different from the main model (the one that will actually be used): they are trained on different sub-samples and therefore end up with differently tuned weights and biases. – Eghbal Feb 13 '16 at 20:39
  • I see your point. But keep in mind that only one tenth of the whole data is held out from training and used for validation, and each pattern is used for both training (9 times) and validation (1 time), which is pretty robust. If you want a more reliable error rate, just check how many predicted labels differ from the validation labels. As we said, in the prediction phase you use the "whole-data-trained" model, without touching the cross-validated model. – AlessioX Feb 13 '16 at 20:43
  • Thank you for the guidance and answers. – Eghbal Feb 13 '16 at 20:45
  • You're most certainly welcome. Keep up the good work, I love SVMs! – AlessioX Feb 13 '16 at 20:46
  • AlessioX, a question regarding the difference between 'CrossVal' as a parameter of fitcsvm vs. crossval as a function: in the first case ('CrossVal' set to 'on'), we take the input data passed to fitcsvm, partition it, and get 10 classifiers with 10 different classification accuracies. How exactly does the cross-validation work when we use the crossval function after having trained with fitcsvm? Does it do the exact same 10-fold CV on the exact same dataset as before? If so, what is the SVM trained by fitcsvm good for? Is this what you would then use for prediction (in contrast to what is trained by crossval)? – Pugl Sep 05 '17 at 07:43
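Regarding the per-fold errors discussed in the comments above, kfoldLoss() can also return the individual fold losses instead of their average. A minimal sketch, reusing CVSVMModel from the question:

perFoldLoss = kfoldLoss(CVSVMModel, 'Mode', 'individual');  % one misclassification rate per fold (10-by-1 vector)
averageLoss = mean(perFoldLoss);                            % matches the default kfoldLoss(CVSVMModel)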