I am doing text classification using a CNN.
Let’s say the feature matrix is 6×3 (6 words, each with a 3-dimensional embedding) and the kernel is 2×3 with stride = 1. Without padding, the output is a 1×5 vector.
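To make the shape arithmetic concrete, here is a minimal NumPy sketch of the unpadded case. The feature values and the all-ones kernel are made up for illustration (they are not my real embeddings or learned weights), and `conv_over_words` is a hypothetical helper, not a library function. The output length follows (6 - 2) / 1 + 1 = 5.

```python
import numpy as np

# Toy stand-ins: 6 words x 3 embedding dims, and a 2x3 kernel (illustrative values only).
features = np.arange(18, dtype=float).reshape(6, 3)
kernel = np.ones((2, 3))

def conv_over_words(x, k, stride=1):
    """Slide kernel k down the word axis of x with no padding (hypothetical helper)."""
    n_words, window = x.shape[0], k.shape[0]
    n_out = (n_words - window) // stride + 1  # (6 - 2) // 1 + 1 = 5
    return np.array([np.sum(x[i * stride:i * stride + window] * k)
                     for i in range(n_out)])

out = conv_over_words(features, kernel)
print(out.shape)  # (5,)
```

Because the kernel spans the full embedding width, it only slides along the word axis, so each output entry corresponds to one window of two adjacent words.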
Now, if I add padding so that the output vector is 1×6, I get two different output vectors depending on where the padding goes. If I add the padding at the beginning of the word2vec embedding matrix, I get one vector.
If I add the padding at the end of the word2vec embedding matrix, I get a different vector.
The problem is with max pooling.
The highest value, 0.61, corresponds to a different word depending on which of the two types of padding is used.
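The effect can be reproduced with a small NumPy sketch. Everything here is an assumption for illustration: the toy feature values, the all-ones kernel, a single zero row as padding, and the hypothetical helpers `conv_over_words` and `words_in_window`. It shows that the maximum value is the same under both paddings, but its position in the output vector shifts by one. Subtracting the number of rows padded at the front maps the position back to word indices.

```python
import numpy as np

def conv_over_words(x, k):
    """Slide kernel k down the word axis of x, stride 1, no extra padding."""
    window = k.shape[0]
    return np.array([np.sum(x[i:i + window] * k)
                     for i in range(x.shape[0] - window + 1)])

def words_in_window(index, pad_front, window, n_words):
    """Real word indices covered by output position `index` (padding rows dropped)."""
    start = index - pad_front
    return [p for p in range(start, start + window) if 0 <= p < n_words]

# Toy stand-ins (illustrative values only): 6 words x 3 dims, 2x3 kernel.
features = np.arange(18, dtype=float).reshape(6, 3)
kernel = np.ones((2, 3))
zero_row = np.zeros((1, 3))

out_front = conv_over_words(np.vstack([zero_row, features]), kernel)  # padding first
out_back = conv_over_words(np.vstack([features, zero_row]), kernel)   # padding last

i_front, i_back = int(np.argmax(out_front)), int(np.argmax(out_back))
print(out_front.max(), out_back.max())        # same maximum value in both vectors
print(i_front, i_back)                        # positions differ by one
print(words_in_window(i_front, 1, 2, 6),      # front padding: 1 padded row before the words
      words_in_window(i_back, 0, 2, 6))       # back padding: 0 padded rows before the words
```

In this toy case both positions translate back to the same pair of words once the front-padding offset is removed. The two results would only disagree if the maximum fell in a window that overlaps the padding row, since that window exists under one padding choice and not the other.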
What should I do so that the max-pooled value maps to a unique word?