
I am training a neural network and I found a dataset of pictures taken by drones. It says that it has 20 classes that should be detected. However, the masks all look like this (I'm sorry the image is really dark!): This is my mask

When I try to train the network, it always says that the image only has 3 channels (I figured it was probably RGB). The problem is that my network expects a 20-channel input for the mask (one for each category to detect, like tree, car, human, etc.). Is there a way for me to transform the mask into a 20-channel image? I checked whether the mask only had 20 different pixel values, but it has 254, so I do not think that approach will work...

Thank you! (This is my first question on Stack Overflow, so if there is a problem with the question, just tell me! :-) )

1 Answer


I am not sure I understand your question correctly, but yes, there is a way to upsample the image to make it a 20-channel input. All you have to do is take the original image (let's assume its size is [batch, height, width, #channels]) and convolve it with a kernel of shape [#channels, height, width, 20] while keeping the padding mode 'same'. This would convert your 3-channel image into a 20-channel array (I wouldn't call it an image anymore).
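As a minimal sketch of the idea above (not from the original answer): with a 1×1 kernel, this convolution reduces to a per-pixel linear map from 3 input channels to 20 output channels, so 'same' padding is trivially satisfied. All array names and the random kernel here are hypothetical, just to show the shape transformation:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 3-channel image: (height, width, channels)
image = rng.random((4, 4, 3))

# A 1x1 convolution kernel is just a (in_channels, out_channels) matrix
kernel = rng.random((3, 20))

# Apply the per-pixel linear map across all spatial positions
out = np.einsum("hwc,cn->hwn", image, kernel)

print(out.shape)  # (4, 4, 20)
```

Note that this only changes the shape; as the comments below point out, it does not separate the mask into per-class channels.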

  • Hi, thanks for your answer. If I upsample my mask to a 20-channel input, it will not split my mask into the different categories. I think this might be a problem? I was thinking the first channel would be the trees, the second channel the cars, the third the humans, etc. When I have only one class to detect, it is really easy because the mask is composed of 1s and 0s. However, here I have 20 classes. I do not think it will work if I just take my mask and upsample it to 20 channels with a convolution. I might be wrong, so please correct me if that is the case. – Felix Gauthier Jun 29 '20 at 19:25
  • I see; your solution should be to create a separate mask for each object class in the dataset. All you have to do is take the aggregate (joint) mask and split it per class, while keeping each mask the same size as the original. I hope this makes it clear. – prayingMantis Jun 29 '20 at 19:53
  • Yes, I think I figured out my mistake. I was resizing my mask with interpolation, which changed its values. If I skip the resize, I get only 20 unique values. I will first cast my mask into dummy variables and then I will be okay! Thanks – Felix Gauthier Jun 30 '20 at 01:08