My training data contains one instance with the class label "yes", and I want to remove it. I can skip that instance inside the loop, but I don't know how to save the resulting training data so that I can use it outside the for loop. I am using the following code:

        // Split the data into 10 folds, then merge chosen folds into training and testing sets.
        Dataset[] folds = data.folds(10, new Random(100));
        Dataset training = new DefaultDataset();
        Dataset testing = new DefaultDataset();
        int[] tr = {0, 2, 3, 5, 7, 8, 9};
        int[] te = {1, 4, 6};
        for (int i = 0; i < tr.length; i++) {
            training.addAll(folds[tr[i]]);
        }
        for (int i = 0; i < te.length; i++) {
            testing.addAll(folds[te[i]]);
        }
        int numFolds = 10;
        Dataset[] foldsTrain = training.folds(numFolds, new Random(1));
        // Print every training instance whose class label is not "yes".
        for (int i = 0; i < training.size(); ++i) {
            if (!training.instance(i).classValue().equals("yes")) {
                System.out.println("the new training data" + training.instance(i));
            }
        }

Thanks, all.

  • Do you want to use the Dataset with removed features named as yes without using a loop? – JAMSHAID Nov 02 '19 at 04:12
  • Yes. For example, I want to calculate the accuracy on the new training data after removing the instance whose class is "yes". I know how to calculate it, but how can I get the dataset out of the loop? – user11801243 Nov 02 '19 at 04:17
  • You'll need to iterate over the dataset to remove the yes column. As you remove it, store the removed instance in a new dataset and you'll have a complete set at the end of the loop. You can use it outside the loop. – JAMSHAID Nov 02 '19 at 04:32
  • Update the dataset in your `if` block. – JAMSHAID Nov 02 '19 at 04:33
  • But "yes" is the class label of an instance (a row) in the training data, not a column, and I have already removed it. Now I have training.instance(i) without the removed instance and I want to use it outside the loop. How? – user11801243 Nov 02 '19 at 04:44
  • Why don't you try storing it in a new dataset array and using that array? Convert it to a string or whatever your requirements call for. – JAMSHAID Nov 02 '19 at 04:57
  • I tried to store training.instance(i) in an array, but because the dataset contains instances, an array list of strings or other types will not accept it. – user11801243 Nov 02 '19 at 05:03
  • https://stackoverflow.com/questions/42389203/how-to-convert-the-datasets-of-spark-row-into-string you should have a look at this question, I think – JAMSHAID Nov 02 '19 at 05:16
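
One way to do what the comments suggest is to declare a second collection before the loop and add each instance you want to keep to it. Because that collection is declared outside the loop, it is still in scope once the loop ends. The following is only a minimal sketch using the standard library: `Row`, `removeClass`, and the sample values are hypothetical stand-ins, not part of any dataset library. If the code in the question uses Java-ML (an assumption based on `Dataset`, `DefaultDataset`, and `instance(i).classValue()`), the same pattern would be to create `Dataset filtered = new DefaultDataset();` before the loop and call `filtered.add(training.instance(i));` inside the `if` block.

```java
import java.util.ArrayList;
import java.util.List;

public class FilterByClass {

    // Hypothetical stand-in for a labelled instance; not a type from any library.
    static final class Row {
        final double[] values;
        final Object classValue;

        Row(double[] values, Object classValue) {
            this.values = values;
            this.classValue = classValue;
        }
    }

    // Returns a new list with every row whose class label differs from `label`.
    // The input list is left unchanged.
    static List<Row> removeClass(List<Row> rows, Object label) {
        List<Row> kept = new ArrayList<>(); // declared before the loop, so it outlives it
        for (Row row : rows) {
            if (!label.equals(row.classValue)) {
                kept.add(row);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Made-up sample data, for illustration only.
        List<Row> training = new ArrayList<>();
        training.add(new Row(new double[] {1.0, 2.0}, "no"));
        training.add(new Row(new double[] {3.0, 4.0}, "yes"));
        training.add(new Row(new double[] {5.0, 6.0}, "no"));

        List<Row> filtered = removeClass(training, "yes");

        // `filtered` is usable here, after the loop has finished.
        System.out.println("kept " + filtered.size() + " of " + training.size() + " instances");
    }
}
```

Building a new collection rather than deleting from `training` while iterating over it avoids index shifting and keeps the original data intact, in case you want to compare accuracy before and after the removal.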

0 Answers