Suppose I have two feature maps F1 and F2 output by a network, and I want to compute the convolution of F1 with F2 acting as the kernel. Assume F1 has shape (1, C, 10, 10) and F2 has shape (1, C, 3, 3); the desired result should then have shape (1, 1, 8, 8) with pad = 0, stride = 1 and dilate = 1. The problem is that this only works with a batch size of 1, because the kernel of a Convolution layer has no batch dimension, so I cannot set the weights from a whole batch of output data.
How can I implement this in MXNet?
I have come up with one possible approach using mx.sym.Correlation, but I cannot work out from the documentation how the correlation operator actually computes its output. Alternatively, can I set the weight of an mx.sym.Convolution layer to F2 and its data to F1? Would this interfere with the propagation of gradients during training? A rough sketch of this second idea is shown below.
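Here is roughly what I have in mind for the second idea (untested; it assumes mx.sym.Convolution will accept an arbitrary symbol as its weight input, and the variable names are just placeholders):

```python
import mxnet as mx

# F1: (1, C, 10, 10) data, F2: (1, C, 3, 3) acting as the kernel.
f1 = mx.sym.Variable('f1')   # in practice, the output of an earlier layer
f2 = mx.sym.Variable('f2')   # in practice, the output of another layer

# Feed F2 directly as the weight of a Convolution layer.
# The weight shape must be (num_filter, C, kh, kw) = (1, C, 3, 3),
# which matches the shape of F2.
out = mx.sym.Convolution(data=f1, weight=f2, no_bias=True,
                         num_filter=1, kernel=(3, 3),
                         stride=(1, 1), pad=(0, 0), dilate=(1, 1))
# out should have shape (1, 1, 8, 8) -- but only for batch size 1,
# and I am not sure whether gradients flow back into f2 correctly.
```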
[Update] What I want to do is illustrated by the following example.
By correlation, I mean that F2 acts as a correlation (or convolution) kernel that slides over F1. For example, with
     1 1 1 2 2
F1 = 2 3 4 1 1
     0 0 0 2 3

     0 1 0
F2 = 1 0 1
     0 1 0
Then, the correlation result should be
R = F1 * F2 = [7 5 9]
where
    1 1 1     0 1 0
7 = 2 3 4  x  1 0 1  = 1 + 2 + 4 + 0
    0 0 0     0 1 0

    1 1 2     0 1 0
5 = 3 4 1  x  1 0 1  = 1 + 3 + 1 + 0
    0 0 2     0 1 0

    1 2 2     0 1 0
9 = 4 1 1  x  1 0 1  = 2 + 4 + 1 + 2
    0 2 3     0 1 0
In the above example, stride = 1, pad = 0, and dilate = 1.
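For reference, here is a small NumPy check of the numbers in the example above (plain loops, only to make the expected result concrete; this is not meant as the MXNet solution):

```python
import numpy as np

F1 = np.array([[1, 1, 1, 2, 2],
               [2, 3, 4, 1, 1],
               [0, 0, 0, 2, 3]], dtype=np.float32)
F2 = np.array([[0, 1, 0],
               [1, 0, 1],
               [0, 1, 0]], dtype=np.float32)

# Slide F2 over F1 with stride 1 and no padding.
out_h = F1.shape[0] - F2.shape[0] + 1   # 1
out_w = F1.shape[1] - F2.shape[1] + 1   # 3
R = np.zeros((out_h, out_w), dtype=np.float32)
for i in range(out_h):
    for j in range(out_w):
        # element-wise product of the current window with F2, then sum
        R[i, j] = np.sum(F1[i:i + 3, j:j + 3] * F2)

print(R)   # expected: [[7. 5. 9.]]
```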