0

I have an image dataset, and before feeding it to a deep learning algorithm I need to crop the images to the same size. All images have black margins of different sizes, as the image below demonstrates. Any suggestions for a way to crop images with different margin sizes?

Sarah K
  • 63
  • 10

4 Answers

1

First, apply a threshold with a low intensity value (if your background is definitely completely black, you could even threshold at an intensity of 1) to separate everything that is not border from the border.

Next, use connected-component labeling to determine all isolated foreground components. The central scan image you are interested in should then always be the biggest component. Crop to this biggest component to remove the border together with any non-black artifacts (labels, letters, etc.). You should be left with only the borderless scan.

You can find all the algorithms needed in any basic image processing library. I'd personally recommend looking into OpenCV, which also includes Python bindings.

T A
  • 1,677
  • 4
  • 21
  • 29
1

One way to do this could be as follows:

  • Flood-fill the image with red starting at the top-left corner, and allowing around 5% divergence from the black pixel there.

  • Now make everything that is not red into white - because the next step after this looks for white pixels.

  • Now use findContours() (which looks for white objects) and choose the largest white contour as your image and crop to that.

You could consider making things more robust by considering some of the following ideas:

  • You could normalise a copy of the image to the full range of black to white first in case you get any with near-black borders.

  • You could check that more than one, or all, of the corner pixels are actually black in case you get images without a border.

  • You could also flag up issues if your cropped image appears to be less than, say 70%, of the total image area.

  • You could consider a morphological opening with a 9x9 square structuring element as the penultimate step to tidy things up before findContours().

Mark Setchell
  • 191,897
  • 31
  • 273
  • 432
1

Since your border color is black (nearly perfect black) and will be the same in all the images, I would suggest applying a binary threshold that makes everything white (255) except the black region. Some regions inside the image may be affected too, but that is not a problem.

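Now find the contours in the image. With this threshold the scan is the largest white object, so the largest contour should be your region (it would be the second largest only if the mask were inverted so that the border is white). Calculate the rectangular bounding box of this contour and crop the same region from the original image.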

0

Here is the solution code for this question:

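import os
import warnings

import cv2
import numpy as np

warnings.filterwarnings('ignore')

path = "data/benign/"
img_resized_dir = "data/pre-processed/benign/"

def thyroid_scale():
    # cv2.imwrite fails silently if the output directory does not exist.
    os.makedirs(img_resized_dir, exist_ok=True)
    for item in os.listdir(path):
        if not os.path.isfile(path + item):
            continue
        img = cv2.imread(path + item)
        if img is None:  # skip files that OpenCV cannot decode
            continue
        gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
        # Every pixel brighter than 0 becomes white; the black margin stays black.
        ret, thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY)
        # [-2] selects the contour list in both OpenCV 3 and OpenCV 4.
        contours = cv2.findContours(thresh, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)[-2]
        if not contours:
            continue

        # Take the contour with the largest area as the scan.
        areas = [cv2.contourArea(c) for c in contours]
        max_index = np.argmax(areas)
        cnt = contours[max_index]
        x, y, w, h = cv2.boundingRect(cnt)
        # Trim a fixed number of extra pixels inside the bounding box.
        crop_img = img[y+35:y+h-5, x+25:x+w-10]
        resize_img = cv2.resize(crop_img, (300, 250), interpolation=cv2.INTER_CUBIC)
        cv2.imwrite(img_resized_dir + item, resize_img)

thyroid_scale()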
Sarah K
  • 63
  • 10