I am going through some CNN articles, and I see that they transform the input image to (channel, width, height).
Here is a code example taken from the MXNet CNN tutorial:
import numpy as np
from mxnet import nd

def transform(data, label):
    # 2,0,1 means channels, width, height
    return nd.transpose(data.astype(np.float32), (2, 0, 1)) / 255, label.astype(np.float32)
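To make concrete what this call does to the shape, here is a minimal sketch using NumPy instead of MXNet. The helper name `to_channel_first` and the 4x6x3 test array are my own assumptions for illustration, not part of the tutorial. As far as I can tell, the axis order `(2, 0, 1)` moves the last axis (the channels) to the front and leaves the other two axes in their original relative order.

```python
import numpy as np

def to_channel_first(image):
    # Hypothetical helper: cast to float32, move the last axis (channels)
    # to the front, and scale the pixel values from [0, 255] to [0, 1].
    return np.transpose(image.astype(np.float32), (2, 0, 1)) / 255

# Made-up input: a 4x6 image with 3 channels, stored with the channel axis last.
image = np.full((4, 6, 3), 255, dtype=np.uint8)
converted = to_channel_first(image)

# Print the shape before and after the transpose.
print(image.shape, "->", converted.shape)
```

So the pixel values are unchanged apart from the scaling; only the axis order differs.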
Can anyone explain why we do this transformation?