
I have the following 10-fold cross-validation implementation. I am using a data set published by the UCI Machine Learning Repository. Here is the link for the data set:

Here are the dimensions of my data:

x = 

      data: [178x13 double]
    labels: [178x1 double]

This is the error that I am getting

Index exceeds matrix dimensions.

Error in GetTenFold (line 33)
    results_cell{i,2} = shuffledMatrix(testRows ,:);

This is my code:

%Function that accepts the name of a data file and the number of folds
%for the cross-validation
function [results_cell] = GetTenFold(dataFile, x)
%loading the data file
dataMatrix = load(dataFile);
%combine the data and labels as one matrix
X = [dataMatrix.data dataMatrix.labels];
%getting the length of the matrix
dataRowNumber = length(dataMatrix.data);
%shuffle the matrix while keeping rows intact 
shuffledMatrix = X(randperm(size(X,1)),:);

crossValidationFolds = x;
%Assigning number of rows per fold
numberOfRowsPerFold = dataRowNumber / crossValidationFolds;

crossValidationTrainData = [];
crossValidationTestData = [];
%Assigning 10X2 cell to hold each fold as training and test data
results_cell = cell(10,2);
    %starting from the first row and segment it based on folds
    i = 1;
    for startOfRow = 1:numberOfRowsPerFold:dataRowNumber
        testRows = startOfRow:startOfRow+numberOfRowsPerFold-1;
        if (startOfRow == 1)
            trainRows = (max(testRows)+1:dataRowNumber);
        else
            trainRows = [1:startOfRow-1 max(testRows)+1:dataRowNumber];
            i = i + 1;
        end
        %for i=1:10
        results_cell{i,1} = shuffledMatrix(trainRows ,:);
        results_cell{i,2} = shuffledMatrix(testRows ,:); %This is where I am getting my dimension error
        %end
        %crossValidationTrainData = [crossValidationTrainData ; shuffledMatrix(trainRows ,:)];
        %crossValidationTestData = [crossValidationTestData ;shuffledMatrix(testRows ,:)];
    end
end
add-semi-colons
  • Rather than provide you with an answer, I think it's far better to give you some pointers on how to solve this yourself. Learn to use the debugger. Type `dbstop if error` at the MATLAB command prompt. This will cause MATLAB to drop into the debugger when the error occurs. When in the debugger, you're inside the function and can access all the variables defined there. So type things like `max(testRows)` and `size(shuffledMatrix,1)`. This will already tell you what's wrong. Type `dbcont` to try to continue, or `dbquit` to exit the debugger. – Rody Oldenhuis Oct 04 '12 at 06:24
  • Thanks. I am very new to MATLAB. What I couldn't figure out is why it works without any problem when I run it on a different data set, which is 150 x 4. – add-semi-colons Oct 04 '12 at 13:41

1 Answer

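You're looping over `1:numberOfRowsPerFold:dataRowNumber`, which is `1:(178/x):178` (a step of 17.8 when `x` is 10), and `i` increments on every iteration after the first. So `i` can run past the 10 rows preallocated for `results_cell`, which is one place to check for an out-of-range index.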

Another way to get the error is that `testRows` selects rows outside the bounds of `shuffledMatrix`.
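
It may help to look at the arithmetic behind both checks. With 178 rows and 10 folds, 178/10 = 17.8, so the loop step and the `testRows` range are not integers, whereas the 150-row data set mentioned in the comments gives 150/10 = 15. The following is a minimal sketch, not the asker's function, of one way to build fold boundaries that are always integers and never run past the last row. The names `edges` and `folds` are illustrative, and the sketch stores row indices rather than the data itself.

```matlab
% Sketch: integer fold boundaries for n rows and k folds.
% edges and folds are illustrative names, not part of the original code.
n = 178;                                % rows in the data set from the question
k = 10;                                 % number of folds

rowsPerFold = n / k;                    % 17.8 here, so it cannot serve as a loop step
edges = round(linspace(0, n, k + 1));   % k+1 integer boundaries running from 0 to n

folds = cell(k, 2);                     % column 1: training rows, column 2: test rows
for f = 1:k
    testRows  = (edges(f) + 1):edges(f + 1);   % integer range that stays within 1..n
    trainRows = setdiff(1:n, testRows);        % every remaining row
    folds{f, 1} = trainRows;
    folds{f, 2} = testRows;
end
```

Indexing the shuffled matrix with `folds{f, 2}` and `folds{f, 1}` would then give the test and training rows for fold `f`. Because the loop runs exactly `k` times, the fold counter cannot exceed the number of rows in the cell array.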

Learn to debug

To pause the code and start debugging when the error occurs, run `dbstop if error` before executing your code. MATLAB then enters debug mode as soon as it encounters an error, and you can inspect the state of the variables right before things go wrong.

(To disable this debugging mode, run `dbclear if error`.)
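
A debugging session along these lines might look like the sketch below. The file name `'wine.mat'` is a hypothetical placeholder for whatever file is passed to `GetTenFold`; the remaining commands are the ones named in the answer and in the comments on the question.

```matlab
dbstop if error                  % enter the debugger when an error is thrown
GetTenFold('wine.mat', 10);      % 'wine.mat' is a hypothetical file name

% At the K>> debugger prompt, inspect the values used by the failing line:
max(testRows)                    % largest row index requested
size(shuffledMatrix, 1)          % number of rows actually available
i                                % current fold counter
size(results_cell)               % size of the preallocated cell array

dbquit                           % leave the debugger
dbclear if error                 % switch the stop-on-error behaviour off again
```

Comparing `max(testRows)` with `size(shuffledMatrix, 1)`, and `i` with `size(results_cell)`, covers both of the out-of-range possibilities described above.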

Gunther Struyf