Explain the intuition behind a two dimensional image with multiple channels in computer vision

Question

Consider a 2D image of dimensions 46*46*3 (length x height x number of channels). This means the image has the specified length and height and three channels: red, green and blue.

I have then come across an image of dimensions 20*20*32. What does it mean to have 32 channels? Is it right if I say it is an image with 32 channels representing 32 colors?

score 2 · Answer 1 · answered Nov 22 '17 at 19:00

You are on the right track. Typically you see three channels representing RGB. But you can have four channels, adding one for alpha, which gives RGBA. The alpha value says how transparent that particular pixel is supposed to be when it is combined in layers with other images.
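As a rough illustration of what the alpha channel does when layers are combined, here is a minimal NumPy sketch of the standard "over" blend onto an opaque background. The function name `composite_over`, the 2x2 arrays and the assumption that all values lie in [0, 1] are made up for this example.

```python
import numpy as np

def composite_over(foreground_rgba, background_rgb):
    """Blend an RGBA foreground over an opaque RGB background (values in [0, 1])."""
    rgb = foreground_rgba[..., :3]
    alpha = foreground_rgba[..., 3:4]  # shape (H, W, 1) so it broadcasts over RGB
    return alpha * rgb + (1.0 - alpha) * background_rgb

# Hypothetical 2x2 example: a half-transparent red layer over a white background.
foreground = np.zeros((2, 2, 4))
foreground[..., 0] = 1.0   # red channel
foreground[..., 3] = 0.5   # alpha channel: 50% opaque
background = np.ones((2, 2, 3))

blended = composite_over(foreground, background)
print(blended.shape)   # (2, 2, 3)
print(blended[0, 0])   # red = 1.0, green = 0.5, blue = 0.5 (a light red)
```

The fourth channel never reaches the screen as a color; it only weights how much of the other three channels survives the blend.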

Images collected by satellite sensors can contain hundreds of channels per pixel, with each channel being a number that represents some special quality collected by that sensor. Channels might encode the height of that pixel, the temperature as seen from space, or the reflectance as seen in various spectral bands invisible to human eyes. Since computer monitors only have three elements (R, G and B) for each pixel, showing such multi-channel images requires software that maps the many channels into only three for display purposes. Such multi-channel images are often displayed using "false color" techniques that map one or more channel values into a specific range of RGB values.
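One simple way to do such a false-color mapping is to pick three of the channels and stretch each into the 0..255 display range. The sketch below assumes NumPy; the function name `false_color`, the 200-channel random array and the chosen band indices are placeholders, not real sensor data or any particular tool's method.

```python
import numpy as np

def false_color(image, band_indices):
    """Pick three channels and rescale each to 0..255 for display as R, G, B."""
    selected = image[..., list(band_indices)].astype(np.float64)
    lo = selected.min(axis=(0, 1), keepdims=True)
    hi = selected.max(axis=(0, 1), keepdims=True)
    span = np.where(hi > lo, hi - lo, 1.0)  # avoid dividing by zero on flat bands
    scaled = (selected - lo) / span
    return (scaled * 255).round().astype(np.uint8)

rng = np.random.default_rng(0)
multispectral = rng.random((46, 46, 200))   # made-up data, 200 channels per pixel
display = false_color(multispectral, (120, 60, 10))
print(display.shape, display.dtype)   # (46, 46, 3) uint8
```

Real false-color tools usually offer more elaborate mappings (color ramps for a single channel, combinations of bands), but the idea is the same: many channels in, three channels out.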

For more information, and for the source of this answer, see: http://www.georeference.org/doc/images_and_channels.htm

That link has all the info I need for now. It helped. – RPM Nov 22 '17 at 22:29

score 1 · Answer 2 · answered Nov 22 '17 at 22:06


While Randall's answer is usually true for common images, where a few channels represent information like red, green, blue, depth, transparency, temperature and so on, another very common use of multi-channel images is simply storing many images in a single data structure.

So in a 20x20x32 "image" you can store 32 images of size 20x20. Especially in CNNs, which you have tagged, you have several layers with many small images of the same dimensions.
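To make the "many images in one data structure" view concrete, here is a small NumPy sketch. The random values and the variable names are only for illustration, and the height x width x channels axis order is an assumption (some libraries put channels first).

```python
import numpy as np

rng = np.random.default_rng(0)
volume = rng.random((20, 20, 32))   # height x width x channels, made-up values

first_map = volume[:, :, 0]          # one 20x20 image
print(first_map.shape)               # (20, 20)

# Move the channel axis to the front to iterate over the 32 images.
maps = np.moveaxis(volume, -1, 0)
print(maps.shape)                    # (32, 20, 20)
print(np.array_equal(maps[5], volume[:, :, 5]))   # True

pixel_vector = volume[3, 7, :]       # the 32 values stored at one pixel location
print(pixel_vector.shape)            # (32,)
```

Both readings describe the same array: 32 stacked 20x20 images, or a 20x20 grid where each position holds a vector of 32 numbers.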

Another typical use is a so-called image stack in microscopy or some 3D imaging technologies, where you store many images that were taken at different heights.


Piglet


If a 20*20*32 image goes through a convolutional layer (say 45 filters of size 2*2*32, same padding), the new dimension is 20*20*45. So this 45 pertains to 45 stacked small images of dimensions 20*20? What does the 32 in the input image represent? Should we specify what that third dimension represents: colors, or small images like you said? – RPM Nov 22 '17 at 22:42
Can you show me an example where you have a 32-channel image as an input to a CNN? The only person who can tell you what the 32 channels represent in a 20x20x32 image is the person who created it. Colours wouldn't make any sense to me. – Piglet Nov 23 '17 at 06:34
Like you said, it is not an input image (I am pursuing a MOOC). That dimension (32) came after applying filters. I was wondering how to understand an output like 28*28*192 after applying a conv filter. You said there will be 192 small images of the same kind here. Is it that each of these 192 images holds one learnt feature? Sorry if my question is stupid. – RPM Nov 23 '17 at 22:02
It's just the output of the previous layer. Maybe just visualize them and see what they contain? Here's a simple example: you can draw a digit in the upper left corner and see the layers. http://scs.ryerson.ca/~aharley/vis/conv/flat.html – Piglet Nov 24 '17 at 05:17
Thanks for the help. I will go through it. – RPM Nov 24 '17 at 22:35
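The shapes discussed in these comments can be checked with a naive NumPy convolution. This is only a sketch: the function name `conv2d_same` is made up, random numbers stand in for learned weights, and the choice to put the extra padding row and column at the bottom and right for an even kernel size is an assumption (frameworks differ on this). Each of the 45 filters spans all 32 input channels and produces one 20x20 output map.

```python
import numpy as np

def conv2d_same(x, filters):
    """Naive stride-1 convolution (cross-correlation) with zero 'same' padding.

    x:       (H, W, C_in)
    filters: (kH, kW, C_in, C_out)
    returns: (H, W, C_out)
    """
    h, w, c_in = x.shape
    kh, kw, f_cin, c_out = filters.shape
    assert c_in == f_cin, "filter depth must match the input channel count"
    # A total padding of k - 1 keeps H and W unchanged. For even kernel sizes
    # the extra row/column is placed at the bottom/right (an assumption).
    pad_h, pad_w = kh - 1, kw - 1
    padded = np.pad(x, ((pad_h // 2, pad_h - pad_h // 2),
                        (pad_w // 2, pad_w - pad_w // 2),
                        (0, 0)))
    out = np.zeros((h, w, c_out))
    for i in range(h):
        for j in range(w):
            patch = padded[i:i + kh, j:j + kw, :]   # (kH, kW, C_in)
            # Contract the patch against every filter at once -> (C_out,)
            out[i, j, :] = np.tensordot(patch, filters, axes=3)
    return out

rng = np.random.default_rng(0)
x = rng.random((20, 20, 32))            # output of a previous layer (made up)
filters = rng.random((2, 2, 32, 45))    # 45 filters, each of size 2x2x32
out = conv2d_same(x, filters)
print(out.shape)   # (20, 20, 45)
```

So the 32 in the input is the number of feature maps produced by the previous layer, and the 45 in the output is the number of filters in this layer. What each map responds to depends on the learned weights, which is why visualizing them is the practical way to find out.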
