In channels_last format, the shape of the data tensor is (batch_size, height, width, channels) and the shape of the weight tensor is apparently (see reference 2) (rows, cols, input_depth, output_depth).
In channels_first format, the shape of the data tensor is (batch_size, channels, height, width) and the shape of the weight tensor is what?
I've looked high and low for the answer to that question. When I run my code and use model.get_weights() to get the weight and bias tensors, the format of the weight tensors appears to be the same in channels_first as in channels_last. Yet when I write the weight tensors to a file and read them back into my hand-crafted C/C++ code, which doesn't use TensorFlow, it doesn't work: the results are numerically nonsensical. Maybe there is some other problem, but I would like a definitive answer to this question.
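To make the question concrete, here is a minimal sketch of the index arithmetic the non-TensorFlow code would need. It assumes the arrays are written to the file in row-major (C) order, and that the weight tensor really does keep the (rows, cols, input_depth, output_depth) layout in both data formats, as observed above. The helper names flat_index, kernel_offset and data_offset are hypothetical, not part of any library.

```python
def flat_index(shape, index):
    """Return the row-major (C-order) offset of `index` in an array of `shape`."""
    if len(shape) != len(index):
        raise ValueError("shape and index must have the same rank")
    offset = 0
    for dim, i in zip(shape, index):
        if not 0 <= i < dim:
            raise IndexError(f"index {i} out of range for dimension {dim}")
        # The last axis varies fastest in row-major order.
        offset = offset * dim + i
    return offset


def kernel_offset(kernel_shape, row, col, in_ch, out_ch):
    """Offset into a flattened kernel.

    Assumption: the kernel layout is (rows, cols, input_depth, output_depth)
    regardless of the data format.
    """
    return flat_index(kernel_shape, (row, col, in_ch, out_ch))


def data_offset(dims, n, y, x, ch, data_format):
    """Offset into a flattened data tensor.

    `dims` holds the logical sizes (batch_size, height, width, channels);
    only the axis order in memory depends on `data_format`.
    """
    batch, height, width, channels = dims
    if data_format == "channels_last":
        return flat_index((batch, height, width, channels), (n, y, x, ch))
    if data_format == "channels_first":
        return flat_index((batch, channels, height, width), (n, ch, y, x))
    raise ValueError(f"unknown data_format: {data_format!r}")
```

Under these assumptions only the data tensor offsets differ between the two formats. If the hand-written code also permutes the kernel indices for channels_first, or indexes the data as channels_last, that alone would give nonsensical numbers.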
BTW, the reason I'm switching between channels_last and channels_first is that I need to be able to develop my code on a CPU machine and then run large training sessions on a GPU machine.
Any help is appreciated.
References: