
I have scaled my training data and want to run cross-validation to find the best parameters, but I don't know how to do it. I tried to read my scaled training data and assign it to an svm_problem variable:

svm_node My_svm_node[16400][157];
svm_node temp[157];
FILE *fp =NULL;
fp = fopen("Scaled_Train_Data.txt","r");   //my data is in fp
for(int LineNumber = 0 ; stop !=1 ; LineNumber++)
{   
    //std::cout<<"Line Number "<<LineNumber<<" Is processed .. \n";
    if (readline(fp)==NULL)
    {
        stop = 1;
        break;
    }
    char *p=line;
    int next_index=1;
    int index = 0 ;
    double target;
    double value;

    sscanf(p,"%lf",&target);
    while(isspace(*p)) ++p;     //remove any spaces between numbers ...
    while(!isspace(*p)) ++p;

    while(sscanf(p,"%d:%lf",&index,&value)==2)
    {
        for(i=next_index;i<index;i++)
        {
            temp[i-1].index = i;
            temp[i-1].value = 0;
        }
        temp[index-1].index = index;
        temp[index-1].value = value;
        while(*p!=':') ++p;                         //check to see if we obey the rule of libsvm
        ++p;                                        
        while(isspace(*p)) ++p;                     //remove any spaces between numbers 
        while(*p && !isspace(*p)) ++p;              
        next_index=index+1;
    }   
    temp[index].index = -1;
    temp[index].value = 0;
    x[LineNumber] = temp;
}

I can guarantee that the data is read successfully and that the temp variable always holds one feature vector of my scaled training data.

But when I call

svm_cross_validation(&Test_Data,&param,7,target); 

I get a runtime access violation error.

I filled

  • Test_Data.l = number of feature vectors
  • Test_Data.y = feature labels
  • Test_Data.x = feature values

I don't know what's wrong here.

There is something else odd here. When I try to read the index and value of my svm_node entries, I always get the last row of my scaled data and cannot see the rest of it. (I guess the problem lies here.)

for (int j = 0 ; j < 164000 ; j++)  //number of rows 
{
        for (int i = 0 ; i < 157 ; i++)   //maximum number of features 
            {
                    printf("The x[%d][%d] is %d   %lf",j,i,x[j][i].index,x[j][i].value); //I always get the last row for 16400 times !!!!!
                    getchar();
            }
}
PsP
  • This is a programming problem, even if your program ultimately has a computer science application. So I am migrating this question to [so]. – Gilles 'SO- stop being evil' May 17 '13 at 18:58
  • I think this question needs a lot of work. What is that Test_Data you're passing svm_cross_validation? How are you reading into it? You probably will need to take a step back and post a SSCCE (Short Self Contained Complete Example, http://sscce.org/) – carlosdc May 17 '13 at 19:04

1 Answer


If your training data is in LIBSVM format (aka svmlight format), the easiest solution is to have a look at the routine LIBSVM uses to read training data:

void read_problem(const char *filename);

as defined in svm-train.c in the LIBSVM package.
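One thing read_problem does is give every training vector its own run of svm_node entries ending with index -1. If x in the question is an array of pointers, then a loop that stores the address of one reused buffer in every x[i] leaves all rows pointing at the same data, which would match the "always the last row" symptom. The following is only a minimal sketch of the per-row idea, not the code from svm-train.c: Dataset, parse_libsvm_line and load_lines are hypothetical names, and svm_node is redeclared here (same two fields as in svm.h) only so the sketch compiles without the library.

```cpp
#include <cassert>
#include <cstdio>
#include <sstream>
#include <string>
#include <vector>

// Same layout as svm_node in LIBSVM's svm.h; redeclared so this compiles alone.
struct svm_node {
    int index;
    double value;
};

// Hypothetical container: every row owns its storage, so no two x[i] alias.
struct Dataset {
    std::vector<double> y;                      // one label per row
    std::vector<std::vector<svm_node> > rows;   // sparse rows, each ends with index -1
    std::vector<svm_node*> x;                   // x[i] points at rows[i]
};

// Hypothetical helper: parses "label idx:val idx:val ..." into a terminated row.
bool parse_libsvm_line(const std::string& line, double& label,
                       std::vector<svm_node>& row) {
    std::istringstream in(line);
    if (!(in >> label)) return false;
    row.clear();
    std::string token;
    while (in >> token) {
        svm_node node;
        if (std::sscanf(token.c_str(), "%d:%lf", &node.index, &node.value) != 2)
            return false;                       // token is not "index:value"
        row.push_back(node);                    // zero features are simply left out
    }
    svm_node terminator = {-1, 0.0};
    row.push_back(terminator);                  // LIBSVM marks the end of a row with -1
    return true;
}

// Hypothetical helper: fills a Dataset from lines already read from the file.
bool load_lines(const std::vector<std::string>& lines, Dataset& data) {
    data.y.clear();
    data.rows.clear();
    data.x.clear();
    for (std::size_t i = 0; i < lines.size(); ++i) {
        double label;
        std::vector<svm_node> row;
        if (!parse_libsvm_line(lines[i], label, row)) return false;
        data.y.push_back(label);
        data.rows.push_back(row);
    }
    // Take addresses only after all rows exist, so they stay valid.
    for (std::size_t i = 0; i < data.rows.size(); ++i)
        data.x.push_back(&data.rows[i][0]);
    return true;
}
```

With the real library, the svm_problem fields could then be set as l = (int)data.y.size(), y = &data.y[0] and x = &data.x[0], provided data outlives the svm_problem and any model trained from it.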

Marc Claesen
  • Thanks, I did it. Just one more question: can you tell me how I can implement in C what grid.py does, i.e. the search for the best c and g? I want to implement my whole project, from training to testing, in VS2010, and I have done everything except finding the best c and g. I only know that I should use cross-validation, but I don't know how to set its parameters (k-fold, etc.). – PsP May 18 '13 at 05:58
  • @PANAHI you probably should ask two new questions – Bull May 18 '13 at 13:30
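
For reference, the search grid.py performs by default is a plain loop over powers of two: log2(C) from -5 to 15 in steps of 2 and log2(gamma) from 3 down to -15 in steps of 2, scoring each pair by 5-fold cross-validation accuracy. A rough sketch of that loop follows; it is not a port of grid.py. GridResult, grid_search and toy_score are hypothetical names, and toy_score is a made-up stand-in for the real scoring step. With LIBSVM, the scoring function would set param.C and param.gamma, call svm_cross_validation(&prob, &param, nr_fold, target), and return the fraction of target[i] equal to prob.y[i].

```cpp
#include <cassert>
#include <cmath>

// Hypothetical result type: best parameter pair and its score.
struct GridResult {
    double C;
    double gamma;
    double score;
};

// Hypothetical helper: tries every (C, gamma) on the default grid.py grid
// and keeps the pair with the highest score returned by cv_score(C, gamma).
template <typename CvScore>
GridResult grid_search(CvScore cv_score) {
    GridResult best = {0.0, 0.0, -1.0};          // accuracy is never below 0
    for (int log2c = -5; log2c <= 15; log2c += 2) {
        for (int log2g = 3; log2g >= -15; log2g -= 2) {
            double C = std::ldexp(1.0, log2c);    // exactly 2^log2c
            double gamma = std::ldexp(1.0, log2g);
            double score = cv_score(C, gamma);
            if (score > best.score) {
                best.C = C;
                best.gamma = gamma;
                best.score = score;
            }
        }
    }
    return best;
}

// Toy stand-in for cross-validation accuracy (an assumption for illustration
// only): peaks at C = 2^3, gamma = 2^-3 and falls off away from that point.
double toy_score(double C, double gamma) {
    double dc = std::log(C) / std::log(2.0) - 3.0;
    double dg = std::log(gamma) / std::log(2.0) + 3.0;
    return 100.0 - dc * dc - dg * dg;
}
```

Calling grid_search(toy_score) walks all 11 x 10 pairs and returns the pair at the toy peak; swapping in a function that wraps svm_cross_validation turns it into an actual parameter search.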