
I am working on a multiclass semantic segmentation dataset whose ground truth masks are RGB images, one per original image. The dataset has 24 classes; the following table lists each class and its RGB value:

name r g b
unlabeled 0 0 0
paved-area 128 64 128
dirt 130 76 0
grass 0 102 0
gravel 112 103 87
water 28 42 168
rocks 48 41 30
pool 0 50 89
vegetation 107 142 35
roof 70 70 70
wall 102 102 156
window 254 228 12
door 254 148 12
fence 190 153 153
fence-pole 153 153 153
person 255 22 96
dog 102 51 0
car 9 143 150
bicycle 119 11 32
tree 51 51 0
bald-tree 190 250 190
ar-marker 112 150 146
obstacle 2 135 115
conflicting 255 0 0

[Image: sample RGB ground truth segmentation mask]

There are 400 images in the dataset, each 4000 × 6000 px. The directory structure of the dataset is shown below:

dataset_folder
├── original_images
│   ├── 000.png
│   ├── 001.png
│   ├── ...
│   ├── 399.png
│   └── 400.png
└── masks
    ├── 000.png
    ├── 001.png
    ├── ...
    ├── 399.png
    └── 400.png

I want to convert the RGB masks into semantic segmentation masks by assigning each pixel an integer in the range 0-23 (one integer per class) and save the results to the working directory. Can someone please suggest efficient code for this task?

Ayush
Hi, I tried the code I found here: [https://www.bulentsiyah.com/preprocessing-rgb-image-masks-to-segmentation-masks]; maybe it will work for you. – Paul Feb 10 '21 at 16:33

1 Answer


I had a similar problem. My solution is probably not the most efficient, but since there is no other answer, I'll share it anyway:

First, load the image into an array, for example with OpenCV.

For this example, let's make an "image" of 4 × 3 px (width × height) with three channels:

import numpy as np

# 3 rows (height) x 4 columns (width) x 3 channels (RGB)
img = np.array([[
    [128, 64,128],
    [  0,  0,  0],
    [  0,  0,  0],
    [  0,  0,  0]],
   [[128, 64,128],
    [  0,102,  0],
    [  0,  0,  0],
    [  0,  0,  0]],
   [[130, 76,  0],
    [130, 76,  0],
    [130, 76,  0],
    [130, 76,  0]]])
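One caveat if the array comes from OpenCV: `cv2.imread` returns the channels in BGR order, while the class table lists RGB values, so the channel axis has to be reversed before matching colors. A minimal NumPy-only sketch (the `bgr_to_rgb` helper and the sample pixel are illustrative names, not part of the original answer):

```python
import numpy as np

def bgr_to_rgb(bgr):
    """Reverse the last axis of an (H, W, 3) array, turning BGR into RGB."""
    return np.ascontiguousarray(bgr[..., ::-1])

# Stand-in for a loaded mask: one "dirt" pixel as OpenCV would store it (B, G, R).
bgr_pixel = np.array([[[0, 76, 130]]], dtype=np.uint8)
rgb_pixel = bgr_to_rgb(bgr_pixel)
print(rgb_pixel[0, 0].tolist())  # [130, 76, 0], the table's RGB value for dirt
```

`cv2.cvtColor(img, cv2.COLOR_BGR2RGB)` performs the same conversion.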


Make a dictionary that maps each RGB value to the class index wanted in the mask (I wrote it by hand for this example, but you can build it with pandas if you have a table like the one above). Then make a list of the values encountered in the image, and finally create the mask from the corresponding class indices.

unlabeled = str([0, 0, 0])
paved_area = str([128,  64, 128])
dirt = str([130,  76,   0])
grass = str([  0, 102,   0])

labels = {unlabeled:0, paved_area:1, dirt:2, grass:3}

print(labels)
# {'[0, 0, 0]': 0, '[128, 64, 128]': 1, '[130, 76, 0]': 2, '[0, 102, 0]': 3}

width = img.shape[1]
height = img.shape[0]

# .tolist() yields plain Python ints, so str() produces keys that match the dictionary
values = [str(img[i, j].tolist()) for i in range(height) for j in range(width)]
print(values)
# ['[128, 64, 128]', '[0, 0, 0]', ..., '[130, 76, 0]']
print(len(values))
# 12 (width * height)

# Look up the class index of every pixel, then restore the 2-D image shape
mask = [labels[value] for value in values]
mask = np.asarray(mask).reshape(height, width)

print(mask)
# [[1 0 0 0]
#  [1 3 0 0]
#  [2 2 2 2]]
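For 4000 × 6000 px masks, the per-pixel Python loop above will be slow. A vectorized alternative, sketched here under some assumptions, packs each RGB triple into one 24-bit integer and looks the class up in a precomputed table. The names `CLASS_TABLE`, `parse_palette`, `pack_rgb`, `build_lut` and `rgb_to_index` are hypothetical, the class index is assumed to be the row order of the question's table, and 255 is an arbitrary marker for colors that are not in the table.

```python
import numpy as np

# Class table from the question; the row order defines the class index (0-23).
CLASS_TABLE = """\
unlabeled 0 0 0
paved-area 128 64 128
dirt 130 76 0
grass 0 102 0
gravel 112 103 87
water 28 42 168
rocks 48 41 30
pool 0 50 89
vegetation 107 142 35
roof 70 70 70
wall 102 102 156
window 254 228 12
door 254 148 12
fence 190 153 153
fence-pole 153 153 153
person 255 22 96
dog 102 51 0
car 9 143 150
bicycle 119 11 32
tree 51 51 0
bald-tree 190 250 190
ar-marker 112 150 146
obstacle 2 135 115
conflicting 255 0 0
"""

def parse_palette(table):
    """Return (names, palette); palette is an (N, 3) uint8 array of RGB rows."""
    rows = [line.split() for line in table.strip().splitlines()]
    names = [row[0] for row in rows]
    palette = np.array([[int(v) for v in row[1:4]] for row in rows], dtype=np.uint8)
    return names, palette

def pack_rgb(rgb):
    """Pack (..., 3) RGB values into one 24-bit integer per pixel."""
    rgb = np.asarray(rgb, dtype=np.uint32)
    return (rgb[..., 0] << 16) | (rgb[..., 1] << 8) | rgb[..., 2]

def build_lut(palette, unknown=255):
    """Build a 2**24-entry lookup table (16 MiB) from packed color to class index."""
    lut = np.full(1 << 24, unknown, dtype=np.uint8)
    lut[pack_rgb(palette)] = np.arange(len(palette), dtype=np.uint8)
    return lut

def rgb_to_index(img, lut):
    """Map an (H, W, 3) RGB mask to an (H, W) uint8 class-index mask."""
    return lut[pack_rgb(img)]

names, palette = parse_palette(CLASS_TABLE)
lut = build_lut(palette)  # build once, reuse for every image

# Same 4 x 3 px example as above
demo = np.array([
    [[128, 64, 128], [0, 0, 0], [0, 0, 0], [0, 0, 0]],
    [[128, 64, 128], [0, 102, 0], [0, 0, 0], [0, 0, 0]],
    [[130, 76, 0], [130, 76, 0], [130, 76, 0], [130, 76, 0]],
], dtype=np.uint8)
mask = rgb_to_index(demo, lut)
print(mask)
```

On the small example this gives the same mask as the loop version. The result is a single-channel `uint8` array, so it can be written as a grayscale PNG, for instance with `cv2.imwrite`. Checking the output for the value 255 is a quick way to spot pixels whose color does not appear in the table.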


Lue Mar