I have just started working on the Pascal VOC segmentation dataset, but I have trouble understanding the colour coding used in the ground-truth labels. I assumed pixels would be annotated 1 through 20, one value per class, but what I get are 8-bit PNG images with pixel values in the range 0-255.
For a certain pixel belonging to the aeroplane class in 2007_000033.png, I get the values (128, 0, 0), while another pixel belonging to the train class in 2007_000123.png gives the values (128, 0, 192), and so on.
How do I map these colours to their classes and produce a one-hot encoding? Do I need to specify the pixel values for each class myself (for example, search for pixels with value (128, 0, 0) and encode them as 1 for the aeroplane class)?
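To make the question concrete, this is roughly the lookup approach I have in mind. It is only a sketch under a few assumptions: the mask has been loaded as an H x W x 3 RGB NumPy array, the class colours follow the bit-shuffling palette generator shipped with the VOC devkit (VOClabelcolormap.m), and index 255 is the void/boundary label. The function names (voc_colormap, rgb_to_class_index, one_hot) are my own and do not come from any library.

```python
import numpy as np


def voc_colormap(n=256):
    """Rebuild the palette with the bit-shuffling scheme assumed from the VOC devkit."""
    cmap = np.zeros((n, 3), dtype=np.uint8)
    for i in range(n):
        c = i
        r = g = b = 0
        for j in range(8):
            # Spread bits 0, 1, 2 of the class index over R, G, B, high bit first.
            r |= ((c >> 0) & 1) << (7 - j)
            g |= ((c >> 1) & 1) << (7 - j)
            b |= ((c >> 2) & 1) << (7 - j)
            c >>= 3
        cmap[i] = (r, g, b)
    return cmap


def rgb_to_class_index(rgb_mask, num_classes=21):
    """Map an (H, W, 3) RGB mask to an (H, W) array of class indices.

    Pixels whose colour matches none of the first `num_classes` palette
    entries are left as 255 (treated here as void).
    """
    cmap = voc_colormap()
    index = np.full(rgb_mask.shape[:2], 255, dtype=np.uint8)
    for k in range(num_classes):
        index[np.all(rgb_mask == cmap[k], axis=-1)] = k
    return index


def one_hot(index, num_classes=21):
    """Turn an (H, W) index map into an (H, W, num_classes) one-hot array.

    Void pixels (255) match no class and become all-zero vectors.
    """
    return (index[..., None] == np.arange(num_classes)).astype(np.uint8)


# Tiny hand-made mask: background, aeroplane, and one unknown colour.
demo = np.array([[[0, 0, 0], [128, 0, 0]],
                 [[128, 0, 0], [1, 2, 3]]], dtype=np.uint8)
demo_index = rgb_to_class_index(demo)
demo_one_hot = one_hot(demo_index)
```

Here (128, 0, 0) comes out as index 1 (aeroplane, with background as 0). If the PNGs are read in palette mode instead (for example with PIL and no conversion to RGB), the array may already hold class indices rather than colours, in which case only the one_hot step would be needed.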
Sorry if this is a duplicate; I have seen a few similar questions on SO, but none of them helped. Thanks.