In most TensorFlow tutorials, authors use channels-last dimension ordering, e.g.
input_layer = tf.reshape(features, [-1, 28, 28, 1])
where the last element represents the number of channels (https://www.tensorflow.org/tutorials/layers). Being used to Theano and NumPy (both default to C ordering, i.e. row-major), I find this awkward. Moreover, having read the documentation on in-memory layout schemes in TensorFlow, I reckon a channels-last layout will cause more cache misses: convolutions are carried out on individual channels, but in channels-last ordering the values of different channels are interleaved in linear memory, which effectively shrinks the usable cache by a factor of N (where N is the number of channels). This seems especially inefficient for 3D and 4D convolutions. Am I getting something wrong?
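To make the layout concern concrete, here is a minimal sketch in NumPy (which, like Theano, defaults to C ordering) showing how the two orderings place the values of a single channel in linear memory. The shapes and stride values are illustrative assumptions, not anything from the tutorial:

import numpy as np

# A single image: height 4, width 4, 3 channels.
h, w, c = 4, 4, 3

# Channels-last (NHWC-style): shape (h, w, c).
nhwc = np.arange(h * w * c, dtype=np.float32).reshape(h, w, c)

# Channels-first (NCHW-style): same data, transposed and made contiguous.
nchw = np.ascontiguousarray(nhwc.transpose(2, 0, 1))

# In C (row-major) order, the last axis is contiguous in memory.
# Channels-last: consecutive floats belong to *different* channels,
# so reading one channel steps over c elements at a time.
print(nhwc.strides)  # (48, 12, 4): within a channel, each step skips 12 bytes

# Channels-first: each channel occupies one contiguous h*w block.
print(nchw.strides)  # (64, 16, 4): within a channel, each step skips 4 bytes

# Reading all values of channel 0:
ch0_last = nhwc[:, :, 0]   # strided access across interleaved channels
ch0_first = nchw[0, :, :]  # contiguous access
assert np.array_equal(ch0_last, ch0_first)

The assertion confirms that both views hold the same data; only the stride pattern, and hence the cache behavior, differs, which is exactly the question.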
P.S.
I've found a closely related thread (Tensorflow 3 channel order of color inputs). The author of the accepted answer states that TF uses row-major ordering by default, but given that all of the tutorials I've found so far show channels-last ordering, I find that claim misleading.