How to write a good masks script for semantic segmentation?

Question

I have made annotations in JSON to train Deeplabv3plus on my custom dataset, but I have a problem with my masking script. No matter what I do, the borders of my masks are anti-aliased (blurred), so the masks contain many new unique pixel values. I want to draw masks using only the colors given in the classes dict. How can I fix this?

import os
import json
import cv2
import numpy as np
from collections import defaultdict

classes = {'cat': (0, 255, 0),
           'dog': (255, 0, 0),
           'duck': (0, 0, 255),
           'bunny': (255, 255, 0),
           'trash': (255, 0, 255),
           'tire': (0, 255, 255),
           'factory': (128, 0, 0),
           'car': (0, 128, 0),
           'human': (0, 0, 128),
           'deer': (128, 128, 0),
           'wolf': (128, 0, 128),
           'zombie': (0, 128, 128),
           'pig': (192, 0, 0),
           'head': (0, 192, 0),
          # 'guinea': (0, 0, 192),
           'fire': (192, 192, 0),
         #  'watermelon': (192, 0, 192),
           'smoke': (0, 192, 192),
          # 'ruins': (128, 64, 0),
           'hanged_man': (0, 128, 64),
           'free_fall': (64, 128, 0),
           'pennywise': (128, 0, 64),
           'skull': (0, 64, 128),
           'barn': (255, 128, 0)
           }



def work_on(data):
    data = data["_via_img_metadata"]
    used_classes = {}  
    for key, value in data.items():
        filename = value["filename"]
        print("processing", filename)
        img_path = f"{img_dir}/{filename}"
        img = cv2.imread(img_path, cv2.IMREAD_COLOR)
        h, w, _ = img.shape
        mask = np.zeros((h, w, 3), dtype=np.uint8)  # Initialize mask to black

        # Create a dictionary to store binary masks for each class
        binary_masks = defaultdict(lambda: np.zeros((h, w), dtype=np.uint8))

        regions = value["regions"]
        for region in regions:
            shape_attributes = region["shape_attributes"]
            shape_type = shape_attributes["name"]
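            if shape_type != "polygon":
                # Only polygon annotations are supported; skip anything else
                # so stale (or undefined) points are never drawn
                print("skipping non-polygon region:", shape_type)
                continue
            x_points = shape_attributes["all_points_x"]
            y_points = shape_attributes["all_points_y"]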
            region_attributes = region["region_attributes"]
            class_type = region_attributes["name"].strip()

            # Set the color based on the class_type, or use a default color if the class is not in the classes dictionary
            color = classes.get(class_type, (255, 255, 255))  

            # Stack the x/y lists into an (N, 2) int32 array; OpenCV's drawing
            # functions require 32-bit integer points
            contours = np.array(list(zip(x_points, y_points)), dtype=np.int32)

            # Draw filled polygon on the binary mask for the current class
            cv2.fillPoly(binary_masks[class_type], [contours], 255)


            # If class_type not already in used_classes, add it
            if class_type not in used_classes:
                used_classes[class_type] = color


        # Once all regions are drawn, paint each class's binary mask into the
        # color mask. The binary masks hold only 0 and 255, so no threshold is needed.
        for class_name, class_color in classes.items():
            if class_name not in binary_masks:
                continue  # no region of this class in this image
            mask[binary_masks[class_name] == 255] = class_color


        cv2.imwrite(f"{msk_dir}/{filename}", mask)

        # Create the overlay
        overlay = cv2.addWeighted(img, 0.5, mask, 0.5, 0)

        # Draw the legend on the overlay
        legend_start_x = w - 200  # start 200 pixels from the right
        legend_start_y = 20  # start 20 pixels from the top
        legend_gap = 30  # 30 pixels gap between each legend item
        for i, (class_type, color) in enumerate(used_classes.items()):
            legend_y = legend_start_y + i * legend_gap
            cv2.rectangle(overlay, (legend_start_x, legend_y), (legend_start_x + 20, legend_y + 20), color, -1)  # draw a color square
            cv2.putText(overlay, class_type, (legend_start_x + 30, legend_y + 15), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (255, 255, 255), 1)  # draw the class name

        cv2.imwrite(f"{overlay_dir}/{filename}", overlay)



if __name__ == '__main__':

    json_folder = 'DATASET_TRAIN/JSONS'
    img_dir = "DATASET_TRAIN/dataset_images_to_train"
    msk_dir = "DATASET_TRAIN/dataset_masks_to_train"
    overlay_dir = "DATASET_TRAIN/overlay"

    if not os.path.exists(msk_dir):
        os.makedirs(msk_dir)

    if not os.path.exists(overlay_dir):
        os.makedirs(overlay_dir)

    # Loop through all JSON annotation files
    for filename in os.listdir(json_folder):
        if filename.endswith('.json'):  # Check if the file is a JSON file
            filepath = os.path.join(json_folder, filename)
            with open(filepath, 'r') as f:
                data = json.load(f)
                work_on(data)

I suspect the problem might be that the annotation lines sometimes cut across pixels.

Please trim your code to make it easier to find your problem. Follow these guidelines to create a [minimal reproducible example](https://stackoverflow.com/help/minimal-reproducible-example). — Community, May 11 '23 at 23:29

score 1 · Answer 1 · answered May 11 '23 at 22:25

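In your script, you're using cv2.fillPoly to draw filled polygons on the masks. When a shape is drawn with anti-aliasing, its borders are smoothed, so pixels on the edge of the mask take on a range of values between the mask color and the background color; that is the kind of "blurring" you're seeing. Note that cv2.fillPoly only anti-aliases when lineType=cv2.LINE_AA is passed; its default is cv2.LINE_8, which produces hard edges. If the extra values persist, also check how the mask is saved, since a lossy format such as JPEG adds intermediate colors along edges.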

To prevent anti-aliasing and make sure the mask only contains the exact colors you've specified, you can use cv2.drawContours instead. This function allows you to specify a line thickness, and if you set the line thickness to -1, the function will draw a filled polygon.

Here's how you can modify your script to use cv2.drawContours:

Replace this line:

cv2.fillPoly(binary_masks[class_type], [contours], 255)

With these lines:

contours = np.array(contours, dtype=np.int32)
cv2.drawContours(binary_masks[class_type], [contours], -1, 255, thickness=-1)

In the line where you define contours, you're converting the list of points to a NumPy array of int32 type. This is because cv2.drawContours expects the contours to be in this format.

Note: This solution will make the edges of your masks sharp rather than smooth. If you're training a model to detect objects in images, having smooth edges in your training masks might be beneficial, as objects in real-world images are rarely perfectly sharp. However, this depends on your specific use case, and if you want the masks to contain only the exact colors you've specified, disabling anti-aliasing is the way to go.

This looks like ChatGPT output. – tchrist, Jun 26 '23 at 04:01
