Why does the Convolution function in MXNet have a kernel parameter?

Question

I am new to MXNet. In the official docs, a convolution layer can be created like this:

conv = nd.Convolution(data=data, weight=W, bias=b, kernel=(3,3), num_filter=10)

But the weight parameter is required to be a 4-D tensor:

W = [weight_num, stride, kernel_height, kernel_width]

So why do we still need to set a kernel parameter in the Convolution function?

score 1 · Accepted Answer · answered Aug 07 '18 at 19:24

The kernel parameter sets the kernel size, which can be one of:

(width,) - for 1D convolution
(height, width) - for 2D convolution
(depth, height, width) - for 3D convolution

It only defines shapes.

The weight and bias parameters hold the actual values that are going to be trained.

While the kernel shape could probably be inferred from the provided weight, it is more defensive to ask for the kernel shape explicitly than to try to figure it out from whatever is passed as weight.
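To make the relationship between kernel and weight concrete, here is a minimal sketch in plain Python. The helper names expected_weight_shape and check_kernel_matches_weight are hypothetical and are not part of MXNet; the sketch also assumes an ungrouped convolution (num_group=1). It only illustrates the kind of consistency check that an explicit kernel argument makes possible.

```python
def expected_weight_shape(num_filter, in_channels, kernel):
    """Weight shape implied by num_filter, input channels and kernel size.

    Hypothetical helper, assuming an ungrouped convolution (num_group=1).
    """
    return (num_filter, in_channels) + tuple(kernel)


def check_kernel_matches_weight(kernel, weight_shape):
    """Raise ValueError if the trailing weight dims disagree with kernel."""
    kernel = tuple(kernel)
    spatial = tuple(weight_shape[2:])
    if spatial != kernel:
        raise ValueError(
            "kernel %r does not match weight spatial dims %r" % (kernel, spatial)
        )


# 1D, 2D and 3D kernels with 5 filters over a 1-channel input
print(expected_weight_shape(5, 1, (3,)))
print(expected_weight_shape(5, 1, (3, 3)))
print(expected_weight_shape(5, 1, (3, 3, 3)))

# Consistent kernel and weight shape: no error is raised
check_kernel_matches_weight((3, 3), (5, 1, 3, 3))
```

With an explicit kernel, a mismatched weight (say a 5 x 5 kernel stored in the weight while kernel=(3, 3) is requested) can be reported as an error instead of being silently accepted.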

Here is an example of 2D convolution:

import mxnet as mx

# data shape is batch_size x channels x height x width
x = mx.nd.random.uniform(shape=(100, 1, 9, 9))
# kernel is just 3 x 3
# weight is num_filter x channels x kernel_height x kernel_width
# bias is num_filter
conv = mx.nd.Convolution(data=x,
                         kernel=(3, 3),
                         num_filter=5,
                         weight=mx.nd.random.uniform(shape=(5, 1, 3, 3)),
                         bias=mx.nd.random.uniform(shape=(5,)))
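For reference, the output shape of the example above can be worked out by hand. The sketch below is plain Python rather than an MXNet call; conv_output_shape is a hypothetical helper, and it assumes the defaults of stride 1 and zero padding when those arguments are omitted, with no dilation.

```python
def conv_output_shape(data_shape, kernel, num_filter, stride=None, pad=None):
    """Output shape of a convolution over data of shape (batch, channels, *spatial).

    Hypothetical helper: assumes stride 1 and zero padding when not given,
    and no dilation.
    """
    spatial = tuple(data_shape[2:])
    kernel = tuple(kernel)
    if len(spatial) != len(kernel):
        raise ValueError("kernel rank must match the number of spatial dims")
    stride = tuple(stride) if stride else (1,) * len(kernel)
    pad = tuple(pad) if pad else (0,) * len(kernel)
    out = tuple(
        (size + 2 * p - k) // s + 1
        for size, k, s, p in zip(spatial, kernel, stride, pad)
    )
    return (data_shape[0], num_filter) + out


# Same sizes as the example: 100 images, 1 channel, 9 x 9, 3 x 3 kernel, 5 filters
print(conv_output_shape((100, 1, 9, 9), (3, 3), 5))  # (100, 5, 7, 7)
```

Each 9 x 9 input shrinks to 7 x 7 because a 3 x 3 kernel without padding fits in 9 - 3 + 1 = 7 positions per axis, and the channel axis becomes num_filter.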

The documentation explaining various shapes of parameters in case of 1D, 2D or 3D convolutions is quite good: https://mxnet.incubator.apache.org/api/python/ndarray/ndarray.html#mxnet.ndarray.Convolution
