I am very new to image recognition with CNNs and am currently using several standard (pre-trained) architectures available in Keras (VGG and ResNet) for image classification tasks. I am wondering how to generalise the number of input channels to more than 3 (instead of standard RGB). For example, I have an image that was taken through 5 different optical filters, and I am thinking about passing these 5 images to the network.

So, conceptually, I need to pass an input of shape (Height, Width, Depth) = (28, 28, 5), where 28x28 is the image size and 5 is the number of channels.

Is there an easy way to do this with ResNet or VGG?

Arnold Klein
  • Are the channels RGB + 2 more (e.g. alpha) or are they a different encoding altogether? Is there a way of converting your 5 channel format to RGB? – Djib2011 Aug 27 '18 at 21:30
  • @Djib2011, good point! I am sure it is possible; however, I'm wondering whether it is possible to pass them directly as they are. For example, instead of RGB you use the CMYK scheme (or any other). – Arnold Klein Aug 27 '18 at 21:33
  • If I were to guess, I'd say **no**. If you want, you can look at [this](https://stackoverflow.com/questions/51995977/how-can-i-use-a-pre-trained-neural-network-with-grayscale-images/51996037#51996037) answer, where I discussed feeding grayscale images to pre-trained networks. The OP asked if it was possible to replace the input layer so that it can accept a different number of channels. I feel that this is impossible, because the subsequent layers have learned to extract features derived from the layer that is about to be removed. – Djib2011 Aug 27 '18 at 21:44
  • @Djib2011 However, in the link you provided, the 2nd answer (not the accepted one) actually explains and shows that it is possible. They are talking about a channel reduction rather than an increase, though. It would be interesting to know whether it could somehow work when increasing the number of channels. – Kaschi14 Sep 22 '22 at 09:14

1 Answer

If you retrain the models from scratch, the number of input channels is not a problem. Only if you want to use an already trained model (i.e. its pre-trained weights) do you have to keep the input the same.
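A minimal sketch of the retraining case, assuming the Keras bundled with TensorFlow (`tensorflow.keras`). Passing `weights=None` gives a randomly initialised ResNet50, so the channel dimension can be 5. As far as I know, the Keras application models require spatial dimensions of at least 32, so the 28x28 images would first need padding or resizing; the 32x32 size, the 10 classes, and the random batch below are illustrative assumptions, not values from the question.

```python
import numpy as np
from tensorflow.keras.applications import ResNet50

# weights=None -> random initialisation (no ImageNet weights),
# so the input is not restricted to 3 channels.
# Assumption: images were padded/resized from 28x28 to 32x32.
model = ResNet50(
    weights=None,
    include_top=True,
    input_shape=(32, 32, 5),  # (height, width, channels)
    classes=10,               # hypothetical number of target classes
)

# Dummy batch of two 5-channel images, only to check the shapes.
batch = np.random.rand(2, 32, 32, 5).astype("float32")
probs = model.predict(batch, verbose=0)
print(probs.shape)
```

The model then has to be compiled and trained on your own data as usual, since none of its weights come from ImageNet.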

Simdi