I am new to CNNs and want to ask: what do the channels do, in SSD for example? Why do they exist? For example, in 18X18X1024, what does the third number mean?

Thanks for any answer.

1 Answer

The dimensions of an image can be represented using 3 numbers: height, width, and number of channels. For example, a color image in the CIFAR-10 dataset is 32 pixels high and 32 pixels wide, and is represented as 32 x 32 x 3. Here 3 is the number of channels in your image. Color images have 3 channels (usually RGB), while a grayscale image has 1 channel.
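To make the height x width x channels layout concrete, here is a small sketch in plain Python. The helper names (`make_image`, `shape`, `to_grayscale`) are made up for illustration, and averaging the three channels is just one simple way to get a single-channel image, not a standard color conversion.

```python
def make_image(height, width, channels, fill=0):
    """Build a height x width x channels image as nested lists (hypothetical helper)."""
    return [[[fill] * channels for _ in range(width)] for _ in range(height)]


def shape(image):
    """Return (height, width, channels) of a nested-list image."""
    return (len(image), len(image[0]), len(image[0][0]))


def to_grayscale(image):
    """Collapse the channel axis by plain averaging, leaving one channel per pixel."""
    return [[[sum(pixel) / len(pixel)] for pixel in row] for row in image]


color = make_image(32, 32, 3)   # same size as a CIFAR-10 RGB image
gray = to_grayscale(color)

print(shape(color))  # (32, 32, 3)
print(shape(gray))   # (32, 32, 1)
```

The first two numbers say where a pixel is; the third says how many values are stored at that position.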

A CNN learns features of the images that you feed it, with increasing levels of complexity. Each convolutional layer applies a set of filters, and each filter produces one output channel (a feature map), so the channels are where these features are represented. The deeper you go into the network, the more channels you will typically have, representing more complex features. The network then uses these features to perform object detection.

In your example, 18X18X1024 means your input image is now represented as an 18 x 18 grid with 1024 channels, where each channel represents some complex feature/information about the image.
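As a rough sketch of where the third number comes from, the function below computes the output shape of a convolutional layer using the usual output-size formula. The layer settings here (a 512-channel input, 1024 filters of size 3 x 3, padding 1) are assumptions chosen to reproduce your 18 x 18 x 1024 shape; they are not taken from the actual SSD architecture.

```python
def conv_output_shape(in_shape, num_filters, kernel_size, stride=1, padding=0):
    """Output (height, width, channels) of a conv layer with square kernels.

    The output channel count equals the number of filters, regardless of
    how many channels the input has.
    """
    height, width, _in_channels = in_shape
    out_h = (height + 2 * padding - kernel_size) // stride + 1
    out_w = (width + 2 * padding - kernel_size) // stride + 1
    return (out_h, out_w, num_filters)


# Assumed example layer, not SSD's real configuration.
feature_map = conv_output_shape((18, 18, 512), num_filters=1024,
                                kernel_size=3, stride=1, padding=1)
print(feature_map)  # (18, 18, 1024)

# Each of the 18 * 18 grid positions holds a vector of 1024 feature values.
h, w, c = feature_map
print(h * w * c)  # 331776
```

So the 1024 is not a property of the original image. It is set by how many filters the layer has, and each filter contributes one channel.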

Since you are a beginner, I suggest you look into how CNNs work in general, before diving into object detection. A good start would be image classification using CNNs. I hope this answers your question. Happy learning!! :)