I am working on the MIAS dataset, and as a preprocessing step I am trying to remove the labels that appear on some of the images. My approach works on some images but not on others, and I can't figure out why. I apply a binary threshold, find the contours, and keep only the largest one as a mask:

import cv2
import numpy as np
import matplotlib.pyplot as plt

x = 4
y = 4
fig = plt.figure(figsize=(25, 15))  # one figure holding the whole 4x4 grid
for i in range(1, 5):
    img = cv2.imread(img_add[i], 0)  # img_add is an array of image addresses; 0 = grayscale

    ax = plt.subplot(x, y, (i - 1) * 4 + 1)
    plt.title("orig")
    plt.imshow(img, cmap="gray")

    # Pixels brighter than 8 become foreground (255)
    ret, thresh1 = cv2.threshold(img, 8, 255, cv2.THRESH_BINARY)

    ax = plt.subplot(x, y, (i - 1) * 4 + 2)
    plt.title("thresh")
    plt.imshow(thresh1, cmap="gray")

    # [-2:] handles both the 3-value (OpenCV 3) and 2-value (OpenCV 4) return
    contours, hierarchy = cv2.findContours(thresh1, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)[-2:]
    # Keep only the contour with the largest area
    maxContourData = max(contours, key=cv2.contourArea)
    mask = np.zeros_like(thresh1)
    cv2.fillPoly(mask, [maxContourData], 255)

    ax = plt.subplot(x, y, (i - 1) * 4 + 3)
    plt.title("mask")
    plt.imshow(mask, cmap="gray")

    # Zero out everything outside the largest contour
    img = cv2.bitwise_and(img, img, mask=mask)

    ax = plt.subplot(x, y, (i - 1) * 4 + 4)
    plt.title("final")
    plt.imshow(img, cmap="gray")

It worked on images 1 and 3 but not on images 2 and 4.

Can someone please tell me how to make this work on all images as intended? Looking at the binarised images, I would expect the contouring to work, but on some images it apparently fails to identify the label and the breast as separate contours, even though they do not look connected.
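
One way to test that guess is to measure how much of each image edge survives the threshold, since a bright strip along an edge could join the label and the breast into a single contour. A minimal sketch, assuming a grayscale uint8 image; the helper name `border_foreground` and the synthetic `demo` image are hypothetical and not part of the original code:

```python
import numpy as np

def border_foreground(gray, thresh=8):
    """Return the fraction of above-threshold pixels on each image edge."""
    binary = gray > thresh  # same test as cv2.THRESH_BINARY: src > thresh
    return {
        "top": float(binary[0, :].mean()),
        "bottom": float(binary[-1, :].mean()),
        "left": float(binary[:, 0].mean()),
        "right": float(binary[:, -1].mean()),
    }

# Synthetic stand-in for a scan (assumed, for illustration only):
# a "breast" blob, a "label" blob, and a faint strip along the top row
# that touches both of them.
demo = np.zeros((100, 100), dtype=np.uint8)
demo[1:90, 0:50] = 120   # breast region
demo[1:12, 70:95] = 200  # label
demo[0, :] = 30          # faint bright strip on the top edge

report = border_foreground(demo)
print(report)
```

A value near 1.0 for an edge means almost the whole edge is foreground after thresholding, which is worth ruling out before blaming the contour step.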

  • It's quite possible that your thresholding is creating a foreground (white) border at the top, so your label is merged into the largest contour. I would suggest using adaptiveThreshold instead of a simple threshold, and drawContours instead of fillPoly. Of course, the latter would not change your results, since your mask is already generated improperly. If you can post scaled-down original images (say 500x500), it would be easier for those trying to help you out. – Knight Forked May 19 '21 at 18:19
  • In general, it would be a good idea to blur the image a bit with a Gaussian filter before thresholding, to get rid of any noise of the kind I suspect in my previous comment. – Knight Forked May 19 '21 at 18:21
  • Always view your thresholded image to see what it is doing. I suspect your threshold level is good for two images but not for the other two images. – fmw42 May 19 '21 at 18:47
  • I would agree with `@Knight Forked`. It is possible that your label and breast are connected at the top of the image. I would suggest checking the thresholded image very carefully at the borders of the image to be sure you do not have a white region along the edges. If you are permitted, remove a border of 1 pixel or whatever is needed all around and replace with black before thresholding. – fmw42 May 19 '21 at 18:51
  • @KnightForked Thank you so much! I cropped 1 pixel off the top of each image and now it works properly! – sarthak behki May 19 '21 at 20:04
  • @fmw42 Tried a few more threshold values and found a better fit too. Thanks for pointing that out! – sarthak behki May 19 '21 at 20:05
