I would like to print the labels of the train and test data used in classification. Here is the definition of both inputs (using Deeplearning4j):

    InputSplit[] inputSplit = fileSplit.sample(pathFilter, splitTrainTest, 1 - splitTrainTest);
    InputSplit trainData = inputSplit[0];
    InputSplit testData = inputSplit[1];

These are then turned into a DataSetIterator like this:

    ImageRecordReader recordReader = new ImageRecordReader(height, width, channels, labelMaker);
    recordReader.initialize(trainData, null);
    trainIter = new RecordReaderDataSetIterator(recordReader, batchSize, 1, numLabels);

Then I want to print how many examples per label were found in each iterator, using this function:

    public void print(DataSetIterator iter){

        HashMap<String, Integer> hash = new HashMap<String, Integer>();

        while(iter.hasNext()){
            DataSet example = iter.next();
            for(int i = 0 ; i<numLabels ; i++){
                if(example.getLabels().getDouble(i)==1.){
                    String label = example.getLabelName(i);
                    if(hash.containsKey(label))
                        hash.put(label, hash.get(label)+1);
                    else
                        hash.put(label, 1);
                }
            }
        }

        for (String label: hash.keySet()){
            System.out.println("   label : " + label.toString() + ", " + hash.get(label) + " examples");
        }
    }

The issue is that it displays only one example per label, whereas there should be many more. When I don't split my dataset using fileSplit.sample(), the function displays the right number of examples. Any suggestions?

Mahesh Khond
Arcyno

1 Answer

If you have a DataSet, you can call toString() on the results of dataset.getFeatureMatrix() and dataset.getLabels().

If you want to print just the label counts, you can use dataset.labelCounts(). I would also look at the DL4J Javadoc: http://deeplearning4j.org/doc
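
To count every example by hand, it may help to treat each DataSet returned by the iterator as a minibatch rather than a single example. The following is a minimal sketch under that assumption: the labels of one batch are taken to be a one-hot matrix with one row per example and one column per class. Plain double[][] arrays stand in for the label matrix so that the sketch runs without DL4J on the classpath, and LabelCounter and countBatch are hypothetical names, not part of the library. With a real DataSet, the same two loops would presumably read each cell through the two-index form getLabels().getDouble(row, col).

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class LabelCounter {

    /**
     * Adds the examples of one minibatch to the running per-label counts.
     * Assumption: labels is one-hot, so labels[row][col] == 1.0 means that
     * example `row` belongs to class `col`.
     */
    public static void countBatch(double[][] labels, List<String> labelNames,
                                  Map<String, Integer> counts) {
        for (double[] row : labels) {                    // one row per example
            for (int col = 0; col < row.length; col++) { // one column per class
                if (row[col] == 1.0) {
                    counts.merge(labelNames.get(col), 1, Integer::sum);
                }
            }
        }
    }

    public static void main(String[] args) {
        // Hypothetical label names and two hand-written minibatches.
        List<String> labelNames = Arrays.asList("cat", "dog");
        double[][] batch1 = {{1, 0}, {0, 1}, {1, 0}};
        double[][] batch2 = {{0, 1}};

        Map<String, Integer> counts = new LinkedHashMap<>();
        countBatch(batch1, labelNames, counts);
        countBatch(batch2, labelNames, counts);

        counts.forEach((label, n) ->
                System.out.println("   label : " + label + ", " + n + " examples"));
    }
}
```

The outer loop over rows is the important part: reading only the first numLabels values of the label matrix would look at a single example per batch, however large the batch is.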

Adam Gibson
I would like to retrieve every single example and its corresponding label. So my question is: does a DataSet contain several examples? (That does not seem to be the case for me, because .labelCounts() returns only one value.) – Arcyno Nov 11 '16 at 15:34