How to pre-process RGB segmentation mask for multi-class semantic segmentation?

Question

I am working on a multi-class semantic segmentation dataset that has RGB ground truth segmentation masks for the original images. The dataset has 24 classes. The following table lists the classes and their respective RGB values:

name	r	g	b
unlabeled	0	0	0
paved-area	128	64	128
dirt	130	76	0
grass	0	102	0
gravel	112	103	87
water	28	42	168
rocks	48	41	30
pool	0	50	89
vegetation	107	142	35
roof	70	70	70
wall	102	102	156
window	254	228	12
door	254	148	12
fence	190	153	153
fence-pole	153	153	153
person	255	22	96
dog	102	51	0
car	9	143	150
bicycle	119	11	32
tree	51	51	0
bald-tree	190	250	190
ar-marker	112	150	146
obstacle	2	135	115
conflicting	255	0	0

[Image: sample RGB ground truth segmentation mask]

There are 400 images in the dataset, each having a shape of (4000 px X 6000 px). The directory structure of the dataset is shown below:

dataset_folder
├── original_images
│   ├── 000.png
│   ├── 001.png
│   ├── ...
│   ├── 399.png
│   └── 400.png
└── masks
    ├── 000.png
    ├── 001.png
    ├── ...
    ├── 399.png
    └── 400.png

I want to create semantic segmentation masks from the RGB masks by assigning each pixel an integer in the range 0-23 (where each integer represents a class), and then save them to the working directory. Can someone please suggest efficient code for this task?

Hi, I tried using the code I found here: [https://www.bulentsiyah.com/preprocessing-rgb-image-masks-to-segmentation-masks]. Maybe it will work for you. - Paul, Feb 10 '21 at 16:33

score 0 · Answer 1 · answered Oct 06 '21 at 08:50

I had a similar problem. My solution is probably not the most efficient, but as there is no other answer, I'll share it anyway:

First, load the image as an array, for example by opening it with OpenCV. Note that cv2.imread returns the channels in BGR order, so convert to RGB before comparing against the table.

For this example, let's make an "image" of 4*3 px (width*height) with three channels:

import numpy as np

# Toy "image": 3 rows (height) x 4 columns (width), 3 channels (RGB)
img = np.array([[
    [128, 64,128],
    [  0,  0,  0],
    [  0,  0,  0],
    [  0,  0,  0]],
   [[128, 64,128],
    [  0,102,  0],
    [  0,  0,  0],
    [  0,  0,  0]],
   [[130, 76,  0],
    [130, 76,  0],
    [130, 76,  0],
    [130, 76,  0]]])

Make a dictionary that maps each RGB value to the desired mask value (I wrote it by hand for this example, but you could build it with pandas if you have a table like the one above). Then make a list of the values encountered in the image, and finally create the mask with the corresponding categorical values.

# Keys are the string form of each RGB triple; values are the class indices
unlabeled = str([0, 0, 0])
paved_area = str([128, 64, 128])
dirt = str([130, 76, 0])
grass = str([0, 102, 0])

labels = {unlabeled: 0, paved_area: 1, dirt: 2, grass: 3}

print(labels)
# {'[0, 0, 0]': 0, '[128, 64, 128]': 1, '[130, 76, 0]': 2, '[0, 102, 0]': 3}
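The dictionary can also be built from the class table instead of by hand. Here is a minimal sketch, assuming the table from the question is available as tab-separated text. The `CLASS_TABLE` string and the `build_labels` helper are hypothetical names, only the first four rows are included, and `pandas.read_csv` would work just as well as the standard-library `csv` module used here:

```python
import csv
import io

# Hypothetical: first rows of the class table from the question, tab-separated.
CLASS_TABLE = (
    "name\tr\tg\tb\n"
    "unlabeled\t0\t0\t0\n"
    "paved-area\t128\t64\t128\n"
    "dirt\t130\t76\t0\n"
    "grass\t0\t102\t0\n"
)


def build_labels(table_text):
    """Map str([r, g, b]) to the row index of that class in the table."""
    reader = csv.DictReader(io.StringIO(table_text), delimiter="\t")
    labels = {}
    for index, row in enumerate(reader):
        rgb = [int(row["r"]), int(row["g"]), int(row["b"])]
        labels[str(rgb)] = index
    return labels


labels = build_labels(CLASS_TABLE)
print(labels)
```

With the full 24-row table, the class indices would run from 0 to 23 in table order.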

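width = img.shape[1]
height = img.shape[0]

# .tolist() yields plain Python ints, so each string matches a dictionary key
# (str(list(img[i, j])) gives '[np.int64(128), ...]' on NumPy 2.x)
values = [str(img[i, j].tolist()) for i in range(height) for j in range(width)]
print(values)
# ['[128, 64, 128]', '[0, 0, 0]', ..., '[130, 76, 0]']
print(len(values))
# 12      (width*height)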

# Look up the class index of every pixel, in row-major order
mask = [labels[value] for value in values]

mask = np.asarray(mask).reshape(height, width)

print(mask)
# [[1 0 0 0]
#  [1 3 0 0]
#  [2 2 2 2]]
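Since the per-pixel Python loop above will be slow on 4000 x 6000 images, here is a vectorized variant as a rough sketch. This is a different technique from the string-key lookup: it compares the whole image against one class colour at a time using NumPy broadcasting. The `rgb_to_class_mask` function, the `COLORS` array and the `ignore_index` value of 255 for unknown colours are all assumptions made for illustration:

```python
import numpy as np

# Class colours in class-index order (first four rows of the question's table).
COLORS = np.array([
    [0, 0, 0],        # 0: unlabeled
    [128, 64, 128],   # 1: paved-area
    [130, 76, 0],     # 2: dirt
    [0, 102, 0],      # 3: grass
], dtype=np.uint8)


def rgb_to_class_mask(rgb, colors, ignore_index=255):
    """Convert an (H, W, 3) RGB mask to an (H, W) uint8 mask of class indices.

    Pixels whose colour is not listed in `colors` are set to `ignore_index`
    (a hypothetical choice, not part of the original answer).
    """
    rgb = np.asarray(rgb)
    mask = np.full(rgb.shape[:2], ignore_index, dtype=np.uint8)
    for index, color in enumerate(colors):
        # Boolean (H, W) array: True where all three channels match this colour.
        mask[np.all(rgb == color, axis=-1)] = index
    return mask


# Same toy image as above: 3 rows x 4 columns, RGB.
img = np.array([
    [[128, 64, 128], [0, 0, 0], [0, 0, 0], [0, 0, 0]],
    [[128, 64, 128], [0, 102, 0], [0, 0, 0], [0, 0, 0]],
    [[130, 76, 0], [130, 76, 0], [130, 76, 0], [130, 76, 0]],
])

mask = rgb_to_class_mask(img, COLORS)
print(mask)
```

The loop runs once per class (24 times for the full table) rather than once per pixel. The resulting single-channel array could then be saved in a lossless format such as PNG, for example with `cv2.imwrite`, so that the class indices are not altered by compression.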
