I am trying to implement a channel for a CNN that splits a sentence into a given number of parts. Each of these parts is given a sentiment score, and the scored parts are fed into the CNN. However, I don't understand how to turn these part scores into an INDArray for the CNN.
My current code:
public INDArray getFeatureVectors(List<String> sentences) {
    // nParts is the number of parts to split the sentence into
    // sentences are a list of sentences in the current batch
    // int[] featureShape = new int[]{sentences.size(), nParts};
    int[] featureShape = new int[4];
    featureShape[0] = sentences.size();
    featureShape[1] = 1;
    featureShape[2] = nParts;
    featureShape[3] = nParts;
    INDArray features = Nd4j.create(sentences.size());
    for (int i = 0; i < sentences.size(); i++) {
        List<String> tokens = // tokenize sentence
        double[] partScores = // calculate the score for each part
        // e.g. for nParts = 2, partScores = {-1.0, 1.0}
        INDArray vector = Nd4j.create(partScores, featureShape);
        INDArrayIndex[] indices = new INDArrayIndex[4];
        indices[0] = NDArrayIndex.point(i);
        indices[1] = NDArrayIndex.point(0);
        indices[2] = NDArrayIndex.all();
        indices[3] = NDArrayIndex.all();
        features.put(indices, vector);
    }
    return features;
}
I have just been experimenting with different feature shapes and indices, but I don't really know what I'm doing, so any help would be greatly appreciated!
I am basing the code on Deeplearning4j's CnnSentenceDataSetIterator, which turns sentences into word embeddings.
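For reference, here is a minimal sketch of one layout that might fit. It assumes a feature shape of [batchSize, 1, nParts, 1], chosen by analogy with the [minibatch, 1, sentence length, word vector size] layout that CnnSentenceDataSetIterator appears to use, with each part contributing a single score instead of a word vector. The class PartScoreFeatures and its methods are hypothetical helpers in plain Java, not part of Deeplearning4j, and whether this shape suits a given network configuration is an assumption to verify.

```java
import java.util.List;

// Hypothetical helper (not a Deeplearning4j class): prepares a flat buffer and
// a 4-D shape for a batch of per-part sentiment scores.
final class PartScoreFeatures {

    private PartScoreFeatures() {
    }

    // Assumed layout: [batchSize, 1 channel, nParts, 1 score per part].
    static int[] featureShape(int batchSize, int nParts) {
        return new int[] {batchSize, 1, nParts, 1};
    }

    // Copies each sentence's scores into one row-major buffer. For the shape
    // above, element (i, 0, j, 0) sits at flat index i * nParts + j.
    static double[] flatten(List<double[]> partScoresPerSentence, int nParts) {
        double[] flat = new double[partScoresPerSentence.size() * nParts];
        for (int i = 0; i < partScoresPerSentence.size(); i++) {
            double[] scores = partScoresPerSentence.get(i);
            if (scores.length != nParts) {
                throw new IllegalArgumentException(
                        "Sentence " + i + " has " + scores.length
                                + " scores, expected " + nParts);
            }
            System.arraycopy(scores, 0, flat, i * nParts, nParts);
        }
        return flat;
    }
}
```

The flat buffer and shape could then be passed together to Nd4j.create(flat, shape), the same overload the code above already calls, assuming the default row-major ordering. That builds the whole batch in one call, so the per-sentence put with indices would no longer be needed.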