In channels_last format, the shape of the data tensor is (batch_size, height, width, channels) and the shape of the weight tensor is apparently (see reference 2) (rows, cols, input_depth, output_depth).

In channels_first format, the shape of the data tensor is (batch_size, channels, height, width) and the shape of the weight tensor is what?

I've looked high and low for the answer to that question. When I run my code and use model.get_weights() to get the weight and bias tensors, it appears that the format of the weight tensors is the same in channels_first as in channels_last. Yet, when I output the weight tensors to a file and read them back into my C/C++ code which is hand-crafted and doesn't use TensorFlow, it doesn't appear to be working. The results are numerically nonsensical. Maybe there is some other problem, but I would like to obtain a definitive answer to this question.

BTW, the reason I'm switching between channels_last and channels_first is that I need to be able to develop my code on a CPU machine and then run large training sessions on a GPU machine.

Any help is appreciated.

References:

Data tensor shape is explained here.

Weight tensor shape is partially explained here.

1 Answer

You can find the answer in the TF/Keras source code, in keras/keras/layers/convolutional/base_conv.py. There, data_format=channels_first or data_format=channels_last only takes effect in the forward computation; in the weight definition, the kernel shape is always:

kernel_shape = self.kernel_size + (input_channel // self.groups, self.filters)

That is why model.get_weights() returns weights in the same format whether you use channels_first or channels_last.
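To make the kernel_shape expression above concrete, here is a minimal sketch that reproduces the same tuple arithmetic in plain Python. The helper name `conv_kernel_shape` is hypothetical and not part of Keras; it only mirrors the one line quoted from the source.

```python
def conv_kernel_shape(kernel_size, input_channel, groups, filters):
    """Hypothetical helper (not a Keras API) mirroring:
    kernel_shape = self.kernel_size + (input_channel // self.groups, self.filters)
    """
    # Tuple concatenation: spatial dims first, then input depth, then output depth.
    return tuple(kernel_size) + (input_channel // groups, filters)


# A 3x3 convolution over a 3-channel input producing 64 filters.
shape_2d = conv_kernel_shape((3, 3), input_channel=3, groups=1, filters=64)
print(shape_2d)  # (3, 3, 3, 64)
```

Note that data_format does not appear anywhere in the expression, which is why the result is the same for both layouts.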

In detail, the convolution is ultimately performed by ops such as conv1d, conv2d, and conv3d in gen_nn_ops, which are implemented in C/C++. Each of these ops takes data_format to interpret the layout of the inputs, but not of the kernels (weights/filters).
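For hand-written code that reads the weights from a file, the practical consequence is that only the data indexing changes between the two formats. The following is a rough sketch in plain Python, not TensorFlow's implementation. It assumes the weights were written as a flat buffer in row-major (C) order, stride 1, no padding, and no bias; all function names are hypothetical.

```python
def kernel_offset(r, c, i, o, cols, in_depth, out_depth):
    # Kernel is always (rows, cols, input_depth, output_depth), row-major.
    return ((r * cols + c) * in_depth + i) * out_depth + o


def data_offset(n, y, x, ch, height, width, channels, data_format):
    # Only the data layout depends on data_format.
    if data_format == "channels_last":   # (batch, height, width, channels)
        return ((n * height + y) * width + x) * channels + ch
    return ((n * channels + ch) * height + y) * width + x  # channels_first


def conv2d_valid(data, kernel, dims, data_format):
    """Naive cross-correlation; returns a dict keyed by (n, y, x, o)."""
    batch, height, width, in_depth, rows, cols, out_depth = dims
    out = {}
    for n in range(batch):
        for y in range(height - rows + 1):
            for x in range(width - cols + 1):
                for o in range(out_depth):
                    acc = 0.0
                    for r in range(rows):
                        for c in range(cols):
                            for i in range(in_depth):
                                d = data[data_offset(n, y + r, x + c, i, height,
                                                     width, in_depth, data_format)]
                                k = kernel[kernel_offset(r, c, i, o, cols,
                                                         in_depth, out_depth)]
                                acc += d * k
                    out[(n, y, x, o)] = acc
    return out


# Toy sizes: batch=1, 4x4 image, 2 input channels, 3x3 kernel, 2 filters.
dims = (1, 4, 4, 2, 3, 3, 2)
batch, height, width, in_depth, rows, cols, out_depth = dims
nhwc = [float(v) for v in range(batch * height * width * in_depth)]
kernel = [0.5 * v for v in range(rows * cols * in_depth * out_depth)]

# Re-lay the same values out as channels_first.
nchw = [0.0] * len(nhwc)
for n in range(batch):
    for y in range(height):
        for x in range(width):
            for ch in range(in_depth):
                src = data_offset(n, y, x, ch, height, width, in_depth, "channels_last")
                dst = data_offset(n, y, x, ch, height, width, in_depth, "channels_first")
                nchw[dst] = nhwc[src]

out_last = conv2d_valid(nhwc, kernel, dims, "channels_last")
out_first = conv2d_valid(nchw, kernel, dims, "channels_first")
print(out_last == out_first)  # True
```

The same kernel buffer and the same `kernel_offset` are used in both calls; if a C/C++ port gives nonsensical numbers, the data offsets (or the layout of the output buffer) are a more likely place to look than the weight layout.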

Little Train
  • Thank you for your answer though I'm not sure I really understand the explanation as I don't understand the code. What is the dimension of kernel_shape? It appears to be 2-D. But, the weight tensor is 4-D, so what is the relationship of the kernel_shape to the weight tensor? – Darrell Hougen Jun 21 '22 at 21:33
  • @DarrellHougen `kernel_shape` is the weight tensor's shape. Consider `kernel_size=(3,3), input_channel=3, groups=1, filters=64`: then `kernel_shape=(3,3)+(3,64)=(3,3,3,64)`, i.e., `[rows(height), cols(width), input_depth, output_depth]` as you mentioned. – Little Train Jun 22 '22 at 03:14