
My goal is to draw the text bounding boxes for the following image. Since the two regions are colored differently, this should be easy: I just need to select the pixels that match certain color values to filter out the other text region, then run a convex-hull detection.

[image: two text regions, one blue and one green]

However, when I zoom in on the image, I notice that the text regions have a zig-zag effect on the edges, so I'm not able to easily find the two color values (for the blue and the green) from the above image.

I wonder: is there a way to remove the zig-zag effect to make sure each phrase is colored consistently? Or is there a way to determine the dominant color of each text region?

[image: zoomed-in view showing the zig-zag (anti-aliased) edges]

Zekun
  • A possible solution would be to threshold the image and use that as a mask to sample the colored pixels in the original image. However, the result might not be what you are looking for because the mask could miss some pixels due to the heavy anti-aliasing. Another solution involves [Color Quantization](https://docs.opencv.org/master/d1/d5c/tutorial_py_kmeans_opencv.html) via clustering - this essentially groups similar-colored pixels into clusters. These clustered pixels can then be re-drawn using one uniform, solid color, effectively aliasing the letters. – stateMachine May 22 '21 at 04:19
  • @stateMachine Yes I agree. The first solution might not generalize well when I have other images with different colors, as I would need to choose an appropriate threshold for each image. I think the second solution looks promising. A combination of both color and position might work for this case. – Zekun May 22 '21 at 05:08

1 Answer


The anti-aliasing causes the color to become lighter (or darker if against a black background) so you can think of the color as being affected by light. In that case, we can use light-invariant color spaces to extract the colors.

So first convert to HSV, since it is a light-invariant color space. Since the background can be either black or white, we will filter both out (if the background is always white and the text can be black, you would need to change the filtering to allow for that).

I took saturation less than 80 as the threshold, since that encompasses white, black, and gray, which are the only colors with low saturation. (Your image is not perfectly white; it's 238 instead of 255, maybe due to JPG compression.)

Since we found all the black, white, and gray, the rest of the image contains our main colors, so I took the inverse of the filter mask. Then, to make the colors uniform and unaffected by light, I set the saturation and value of those pixels to 255; that way the only difference between the colors is the hue. I also set the background pixels to 0 to make finding contours easier, but that's not necessary.

After this you can use whatever method you want to separate the different groups of colors. I just did a quick histogram of the hue values and got three peaks, but two were close together, so they can be bundled together as one. You could use peak finding to locate the peaks automatically. There might be better methods of finding the color groups, but this is what I thought of quickly.

import cv2
import numpy as np

img = cv2.imread('text.png')   # path to your image (placeholder name)
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
mask = hsv[:, :, 1] < 80       # low saturation: white, gray & black (background)
hsv[mask] = 0                  # set bg pixels to 0
hsv[~mask, 1:] = 255           # set fg saturation and value to 255 for uniformity

colors = hsv[~mask]
z = np.bincount(colors[:, 0])  # histogram of the hue values
print(z)

bgr = cv2.cvtColor(hsv, cv2.COLOR_HSV2BGR)
cv2.imshow('bgr', bgr)
cv2.waitKey(0)
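The "bundle close peaks together" step above can be sketched as follows. This is a minimal example on a synthetic hue histogram (the hue values and the merge tolerance of 10 bins are assumptions): each run of non-zero hue bins is treated as one color group, and bins within the tolerance of the previous one are merged into the same group.

```python
import numpy as np

# Synthetic hue histogram standing in for z = np.bincount(colors[:, 0]).
z = np.zeros(180, dtype=int)
z[60] = 500   # green-ish peak
z[63] = 300   # close neighbour of the peak above -> same group
z[120] = 400  # blue-ish peak, far away -> its own group

peaks = np.flatnonzero(z)  # hue bins that actually occur
groups = []
for h in peaks:
    # Merge with the previous group if within 10 hue bins (assumed tolerance).
    if groups and h - groups[-1][-1] <= 10:
        groups[-1].append(h)
    else:
        groups.append([h])

print(len(groups))  # 2 color groups
```

Each resulting group gives a hue range that can be turned into a per-region mask with `cv2.inRange` on the HSV image.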

[image: result with uniform colors after setting saturation and value to 255]

Ta946
  • This looks great! Thanks so much for the detailed explanation. I'll upvote and accept your answer. – Zekun May 22 '21 at 20:39