I have a vector of labeled data elements, like this:
[label1: 1.1, label2: 2.43, label3: 0.5]
[label1: 0.1, label2: 2.0, label3: 1.0]
There can be any number of elements, where each element essentially corresponds to a row of data. I'm trying to parse this into a CSV with column headers, like this:
label1 label2 label3
1.1 2.43 0.5
0.1 2.0 1.0
I've been working with StringBuilder and would prefer to stick with it, but I can use something else if necessary.
I've almost got this working, except for separating the headers from the first row of numeric results.
I have an outer loop which traverses array elements ("rows") and an inner loop which traverses each piece of each array element ("columns"), where in the example above we have 2 "rows" (elements) and 3 "columns" (member indexes).
My code looks like this (the block below both creates the CSV and prints to the screen):
StringBuilder builder = new StringBuilder();
// Write predictions to file
for (int i = 0; i < labeled.size(); i++)
{
    // Discrete prediction
    double predictionIndex =
        clf.classifyInstance(newTest.instance(i));
    // Get the predicted class label from the predictionIndex.
    String predictedClassLabel =
        newTest.classAttribute().value((int) predictionIndex);
    // Get the prediction probability distribution.
    double[] predictionDistribution =
        clf.distributionForInstance(newTest.instance(i));
    // Print out the true predicted label, and the distribution
    System.out.printf("%5d: predicted=%-10s, distribution=",
        i, predictedClassLabel);
    // Loop over all the prediction labels in the distribution.
    for (int predictionDistributionIndex = 0;
         predictionDistributionIndex < predictionDistribution.length;
         predictionDistributionIndex++)
    {
        // Get this distribution index's class label.
        String predictionDistributionIndexAsClassLabel =
            newTest.classAttribute().value(
                predictionDistributionIndex);
        // Get the probability.
        double predictionProbability =
            predictionDistribution[predictionDistributionIndex];
        System.out.printf("[%10s : %6.3f]",
            predictionDistributionIndexAsClassLabel,
            predictionProbability );
        if(i == 0){
            builder.append(predictionDistributionIndexAsClassLabel+",");
            if(predictionDistributionIndex == predictionDistribution.length){
                builder.append("\n");
            }
        }
        // Add probabilities as rows
        builder.append(predictionProbability+",");
    }
    System.out.printf("\n");
    builder.append("\n");
}
The results currently come out like this:
setosa,1.0,versicolor,0.0,virginica,0.0,
1.0,0.0,0.0,
1.0,0.0,0.0,
1.0,0.0,0.0,
1.0,0.0,0.0,
1.0,0.0,0.0,
1.0,0.0,0.0,
where setosa, versicolor, and virginica are the labels. As you can see, it works from the second row onward, but in the first row the labels are interleaved with the values, and I can't figure out how to fix that.
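
For reference, here is a minimal sketch of the output I'm after, assuming the labels and probabilities are already available as plain arrays. The class name CsvSketch, the method toCsv, and the sample values are hypothetical and only for illustration; in the code above the labels would come from newTest.classAttribute().value(j) and each row from clf.distributionForInstance(...). The idea is to write the header line once, in its own loop before the row loop, instead of inside the inner loop:

```java
class CsvSketch {
    // Hypothetical helper: builds one header line, then one line per data row.
    static String toCsv(String[] labels, double[][] rows) {
        StringBuilder builder = new StringBuilder();

        // Header: written once, before any data row is appended.
        for (int c = 0; c < labels.length; c++) {
            if (c > 0) {
                builder.append(',');
            }
            builder.append(labels[c]);
        }
        builder.append('\n');

        // Data rows: a comma goes between values, so there is no trailing comma.
        for (double[] row : rows) {
            for (int c = 0; c < row.length; c++) {
                if (c > 0) {
                    builder.append(',');
                }
                builder.append(row[c]);
            }
            builder.append('\n');
        }
        return builder.toString();
    }

    public static void main(String[] args) {
        // Sample values are made up for illustration.
        String[] labels = {"setosa", "versicolor", "virginica"};
        double[][] rows = {{1.0, 0.0, 0.0}, {0.1, 0.7, 0.2}};
        System.out.print(toCsv(labels, rows));
    }
}
```

With the sample values, this should produce a header line of setosa,versicolor,virginica followed by one comma-separated line per row.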