Suppose I want to represent an image of size H*W with 3 color channels (RGB) as a 3-D numpy array of shape (H, W, 3). Let's take a simple example of shape (4,2,3), so we create the array like this: img = np.arange(24).reshape(4,2,3).
In order to fit the analogy of the above image example, the values of the elements should be -
Channel R: [0,1],[2,3],[4,5],[6,7]
Channel G: [8,9],[10,11],[12,13],[14,15]
Channel B: [16,17],[18,19],[20,21],[22,23]
i.e., 3 outer arrays, with the arrays above nested inside.
However, the result of np.arange(24).reshape(4,2,3) is -
array([[[ 0,  1,  2],
        [ 3,  4,  5]],

       [[ 6,  7,  8],
        [ 9, 10, 11]],

       [[12, 13, 14],
        [15, 16, 17]],

       [[18, 19, 20],
        [21, 22, 23]]])
If I want the first row of the first channel, i.e. img[0,:,0], I would expect [0,1] as the result, but I actually get [0,3] back.
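For reference, here is a minimal snippet that reproduces the behaviour described above. The variable names first_row_first_channel and first_channel are only illustrative.

```python
import numpy as np

# Array intended to be read as (H, W, channels) = (4, 2, 3)
img = np.arange(24).reshape(4, 2, 3)

# "First row of the first channel": row 0, all columns, channel 0
first_row_first_channel = img[0, :, 0]
print(first_row_first_channel)  # [0 3], not the [0 1] I expected

# The whole first channel: consecutive values are 3 apart,
# because the channel axis is the last (fastest-varying) one
first_channel = img[:, :, 0]
print(first_channel)
```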
I understand that if I initialize the ndarray with shape (3,4,2), I will get what I am looking for. But I want to work with the conventional shape of (H,W,depth).
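To illustrate the (3,4,2) observation, the sketch below builds the array channel-first and then moves the channel axis to the end so the shape becomes (H, W, 3). Using np.moveaxis here is my own assumption about one possible way to get there, not necessarily the recommended one, and the names chw and hwc are hypothetical.

```python
import numpy as np

# Channel-first layout: (channels, H, W) = (3, 4, 2)
chw = np.arange(24).reshape(3, 4, 2)
print(chw[0])  # channel R: [[0 1] [2 3] [4 5] [6 7]]

# Move the channel axis to the end to get (H, W, channels)
hwc = np.moveaxis(chw, 0, -1)
print(hwc.shape)     # (4, 2, 3)

# First row of the first channel now matches the expectation
print(hwc[0, :, 0])  # [0 1]

# Channel B as a (4, 2) block
print(hwc[:, :, 2])
```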
Can you please help me see what I am missing?