
I have a vector of labeled data elements, like this:

[label1: 1.1, label2: 2.43, label3: 0.5]

[label1: 0.1, label2: 2.0, label3: 1.0]

There can be any number of elements, and each element corresponds to one row of data. I'm trying to write this out as a CSV with column headers, like this:

label1 label2 label3
1.1    2.43   0.5
0.1    2.0    1.0

I've been working with StringBuilder and would prefer to stick with it, but if necessary I can use something else.

I've almost got this working, except for separating the headers from the first row of numeric results.

I have an outer loop which traverses array elements ("rows") and an inner loop which traverses each piece of each array element ("columns"), where in the example above we have 2 "rows" (elements) and 3 "columns" (member indexes).

My code looks like this (the block below both creates the CSV and prints to the screen):

StringBuilder builder  = new StringBuilder();

// Write predictions to file
for (int i = 0; i < labeled.size(); i++)      
{
    // Discrete prediction
    double predictionIndex = 
        clf.classifyInstance(newTest.instance(i)); 

    // Get the predicted class label from the predictionIndex.
    String predictedClassLabel =
        newTest.classAttribute().value((int) predictionIndex);

    // Get the prediction probability distribution.
    double[] predictionDistribution = 
        clf.distributionForInstance(newTest.instance(i)); 

    // Print out the true predicted label, and the distribution
    System.out.printf("%5d: predicted=%-10s, distribution=", 
                      i, predictedClassLabel); 

    // Loop over all the prediction labels in the distribution.
    for (int predictionDistributionIndex = 0; 
         predictionDistributionIndex < predictionDistribution.length; 
         predictionDistributionIndex++)
    {
        // Get this distribution index's class label.
        String predictionDistributionIndexAsClassLabel = 
            newTest.classAttribute().value(
                predictionDistributionIndex);

        // Get the probability.
        double predictionProbability = 
            predictionDistribution[predictionDistributionIndex];

        System.out.printf("[%10s : %6.3f]", 
                          predictionDistributionIndexAsClassLabel, 
                          predictionProbability );
        if(i == 0){
            builder.append(predictionDistributionIndexAsClassLabel+",");

            if(predictionDistributionIndex == predictionDistribution.length){
                builder.append("\n");
            }
        }
        // Add probabilities as rows     
        builder.append(predictionProbability+",");

        }

    System.out.printf("\n");
    builder.append("\n");

}

The results currently come out like this:

setosa,1.0,versicolor,0.0,virginica,0.0,
1.0,0.0,0.0,
1.0,0.0,0.0,
1.0,0.0,0.0,
1.0,0.0,0.0,
1.0,0.0,0.0,
1.0,0.0,0.0,

where setosa, versicolor, and virginica are the labels. As you can see it works starting on the second row, but I can't figure out how to fix the first row.

Hack-R

1 Answer


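If I understand your question correctly, when i == 0 the inner loop appends each label and its value to the same builder as it goes, so the labels and values end up interleaved on the first row. The newline check in that loop never fires either, because predictionDistributionIndex is always less than predictionDistribution.length inside the loop. To separate the labels out, you could change the inner loop as below: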

// Declare this outside the inner loop (for example, next to builder).
StringBuilder labelRow = new StringBuilder();

    // Loop over all the prediction labels in the distribution.
    for (int predictionDistributionIndex = 0; 
         predictionDistributionIndex < predictionDistribution.length; 
         predictionDistributionIndex++)
    {
        // Get this distribution index's class label.
        String predictionDistributionIndexAsClassLabel = 
            newTest.classAttribute().value(
                predictionDistributionIndex);

        // Get the probability.
        double predictionProbability = 
            predictionDistribution[predictionDistributionIndex];

        System.out.printf("[%10s : %6.3f]", 
                          predictionDistributionIndexAsClassLabel, 
                          predictionProbability );
        if(i == 0){
            // First instance only: collect each class label for the header row.
            labelRow.append(predictionDistributionIndexAsClassLabel+",");
        }

        // Add probabilities as rows
        builder.append(predictionProbability+",");

    }
    if(i == 0){
        // Once the first row's values are appended, put the header line at the very start.
        builder.insert(0,labelRow.toString()+"\n");
    }

This collects the labels in a separate StringBuilder, which is then inserted at the beginning of the main builder once the first row has been processed.
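
For comparison, here is a minimal self-contained sketch of the same idea with the header written before the data loop instead of inserted afterwards. It assumes the labels and probabilities are already available as plain arrays; CsvSketch, toCsv, and the sample values are hypothetical names and data for illustration, not part of Weka. In the code from the question, the labels would come from newTest.classAttribute().value(...) and each row from clf.distributionForInstance(...). Using StringJoiner also avoids the trailing comma on each line.

```java
import java.util.StringJoiner;

public class CsvSketch {

    // Hypothetical helper: labels are the column headers,
    // rows holds one array of probabilities per instance.
    public static String toCsv(String[] labels, double[][] rows) {
        StringBuilder builder = new StringBuilder();

        // Header row, written once before any data.
        builder.append(String.join(",", labels)).append("\n");

        for (double[] row : rows) {
            // StringJoiner puts commas between values only, not after the last one.
            StringJoiner line = new StringJoiner(",");
            for (double value : row) {
                line.add(Double.toString(value));
            }
            builder.append(line.toString()).append("\n");
        }
        return builder.toString();
    }

    public static void main(String[] args) {
        // Made-up sample data standing in for the classifier output.
        String[] labels = {"setosa", "versicolor", "virginica"};
        double[][] rows = {{1.0, 0.0, 0.0}, {0.0, 1.0, 0.0}};
        System.out.print(toCsv(labels, rows));
    }
}
```

Writing the header up front means no i == 0 check is needed inside the loops, at the cost of having the labels available before the first row is processed.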

SomeDude