I am trying to implement a channel for a CNN that splits a sentence into x parts. Each of these parts is then given a sentiment score, and the parts are fed into the CNN. However, I don't understand how to turn these part scores into an INDArray for the CNN.

My current code:

public INDArray getFeatureVectors(List<String> sentences) {
    // nParts is the number of parts to split the sentence into
    // sentences are a list of sentences in the current batch

    // int[] featureShape = new int[]{sentences.size(), nParts}; 
    int[] featureShape = new int[4];
    featureShape[0] = sentences.size();
    featureShape[1] = 1;
    featureShape[2] = nParts;
    featureShape[3] = nParts;
    INDArray features = Nd4j.create(sentences.size());
    for (int i = 0; i < sentences.size(); i++) {

        List<String> tokens = // tokenize sentence

        double[] partScores = // calculate the score for each part
                              // e.g. for nParts = 2, partScores = {-1.0, 1.0}

        INDArray vector = Nd4j.create(partScores, featureShape);
        INDArrayIndex[] indices = new INDArrayIndex[4];
        indices[0] = NDArrayIndex.point(i);
        indices[1] = NDArrayIndex.point(0);
        indices[2] = NDArrayIndex.all();
        indices[3] = NDArrayIndex.all();
        features.put(indices, vector);

    }
    return features;
}

I have just been experimenting with different feature shapes and indices, but I don't really know what I'm doing, so any help would be greatly appreciated!

I am basing the code on Deeplearning4j's CnnSentenceDataSetIterator, which turns sentences into word embeddings.
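For reference, two things in the code above look inconsistent: features is created with only sentences.size() elements rather than with featureShape, and each partScores array holds nParts values while featureShape describes sentences.size() * 1 * nParts * nParts values. A minimal sketch of one possible layout follows, assuming each sentence contributes a single row of nParts scores, so the batch has shape [batchSize, 1, 1, nParts]. It uses plain Java only; the class and method names (PartScoreFeatures, featureShape, flatten) are hypothetical and not part of Deeplearning4j.

```java
import java.util.List;

// Hypothetical helper, not part of Deeplearning4j: lays out per-sentence
// part scores as a row-major buffer plus a 4-D shape.
final class PartScoreFeatures {

    // Assumed layout: [batch, channels = 1, height = 1, width = nParts].
    static int[] featureShape(int batchSize, int nParts) {
        return new int[] {batchSize, 1, 1, nParts};
    }

    // Copies each sentence's scores into one contiguous row-major buffer,
    // so sentence i occupies indices [i * nParts, (i + 1) * nParts).
    static double[] flatten(List<double[]> partScoresPerSentence, int nParts) {
        double[] flat = new double[partScoresPerSentence.size() * nParts];
        for (int i = 0; i < partScoresPerSentence.size(); i++) {
            double[] scores = partScoresPerSentence.get(i);
            if (scores.length != nParts) {
                throw new IllegalArgumentException(
                        "Sentence " + i + " has " + scores.length
                                + " scores, expected " + nParts);
            }
            System.arraycopy(scores, 0, flat, i * nParts, nParts);
        }
        return flat;
    }
}
```

With ND4J on the classpath, the buffer and shape should be usable as Nd4j.create(flat, shape), assuming the default row-major ('c') ordering. Whether the scores belong along the width or the height depends on how the network's input is configured, so treat the [batchSize, 1, 1, nParts] layout as an assumption to check against the model.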

rj93