I'm working with deeplearning4j and don't understand how to get the text of the paragraph whose vector the neural network has classified.
So far I can only get the classification score.
This is my code:
public static void main(String[] args) throws Exception {
    ClassPathResource resource = new ClassPathResource("paravec/recortes");
    LabelAwareIterator iterator = new FileLabelAwareIterator.Builder()
            .addSourceFolder(resource.getFile()).build();

    TokenizerFactory t = new DefaultTokenizerFactory();
    t.setTokenPreProcessor(new CommonPreprocessor());

    ParagraphVectors paragraphVectors = new ParagraphVectors.Builder()
            .learningRate(0.025).minLearningRate(0.001).batchSize(1000)
            .epochs(10).iterate(iterator).trainWordVectors(true)
            .tokenizerFactory(t).build();
    paragraphVectors.fit();

    ClassPathResource unlabeledResource = new ClassPathResource(
            "paravec/caderno");
    FileLabelAwareIterator unlabeledIterator = new FileLabelAwareIterator.Builder()
            .addSourceFolder(unlabeledResource.getFile()).build();

    MeansBuilder meansBuilder = new MeansBuilder(
            (InMemoryLookupTable<VocabWord>) paragraphVectors
                    .getLookupTable(),
            t);
    LabelSeeker seeker = new LabelSeeker(iterator.getLabelsSource()
            .getLabels(),
            (InMemoryLookupTable<VocabWord>) paragraphVectors
                    .getLookupTable());

    while (unlabeledIterator.hasNextDocument()) {
        LabelledDocument document = unlabeledIterator.nextDocument();
        //how to get text paragraph?
        INDArray documentAsCentroid = meansBuilder
                .documentAsVector(document);
    }
}
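For reference, here is a minimal plain-Java sketch of what the loop above seems to be doing. It is not DL4J code: `ParagraphTextSketch`, `centroid`, `cosine` and `scores` are hypothetical names, and the vectors are made-up toy values. It assumes that `MeansBuilder` averages the word vectors of a document and that `LabelSeeker` compares that average with each label vector by cosine similarity. The point it illustrates is that the centroid is only derived from the text, so the text has to be read from the document object itself inside the loop (in DL4J probably through an accessor such as `document.getContent()`, which is an assumption worth checking against the version in use).

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Plain-Java stand-in (no DL4J types): every name here is hypothetical.
class ParagraphTextSketch {

    // Averages the vectors of the known words in a text, giving its centroid.
    static double[] centroid(String text, Map<String, double[]> wordVectors, int dim) {
        double[] mean = new double[dim];
        int known = 0;
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            double[] v = wordVectors.get(token);
            if (v == null) continue; // out-of-vocabulary (or empty) token
            for (int i = 0; i < dim; i++) mean[i] += v[i];
            known++;
        }
        for (int i = 0; i < dim && known > 0; i++) mean[i] /= known;
        return mean;
    }

    // Cosine similarity; returns 0 when either vector has zero length.
    static double cosine(double[] a, double[] b) {
        double dot = 0, na = 0, nb = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            na += a[i] * a[i];
            nb += b[i] * b[i];
        }
        return (na == 0 || nb == 0) ? 0.0 : dot / Math.sqrt(na * nb);
    }

    // Scores one centroid against every label vector, in the map's iteration order.
    static Map<String, Double> scores(double[] centroid, Map<String, double[]> labelVectors) {
        Map<String, Double> result = new LinkedHashMap<>();
        for (Map.Entry<String, double[]> e : labelVectors.entrySet()) {
            result.put(e.getKey(), cosine(centroid, e.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        // Toy two-dimensional vectors, made up for illustration only.
        Map<String, double[]> wordVectors = Map.of(
                "gol", new double[] {1.0, 0.0},
                "bolsa", new double[] {0.0, 1.0});
        Map<String, double[]> labelVectors = new LinkedHashMap<>();
        labelVectors.put("esporte", new double[] {1.0, 0.0});
        labelVectors.put("economia", new double[] {0.0, 1.0});

        String[] paragraphs = {"Gol no fim do jogo", "A bolsa fechou em alta"};
        for (String text : paragraphs) {
            // The text is still in scope here, so it can be reported with its scores.
            double[] c = centroid(text, wordVectors, 2);
            System.out.println(text + " -> " + scores(c, labelVectors));
        }
    }
}
```

In the real loop the same idea would mean printing or storing the document's text next to the scores returned for `documentAsCentroid`, rather than trying to recover the text from the vector.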
Thanks! Renan.