I am doing some OCR with Python. To get the coordinates of the letters in an image, I take the centroid of each region (returned by regionprops from skimage.measure), and if the distance between that centroid and any of the other centroids is less than some value, I drop the region. I thought this would solve the problem of regions nested inside one another, but I missed that if a region with a smaller area is detected first (for example, just part of a letter), all the bigger regions (which may contain the whole letter) are ignored. Here is my code:
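from skimage.measure import regionprops

centroids = []  # column coordinates of the regions kept so far
for region in regionprops(label_image):
    if len(centroids) == 0:
        # the first region is always kept
        centroids.append(region.centroid[1])
        # do some stuff...
    if len(centroids) != 0:
        # horizontal distance from this region to every kept centroid
        distances = []
        for centroid in centroids:
            distance = abs(centroid - region.centroid[1])
            distances.append(distance)
        # keep the region only if it is far enough from all kept ones
        if all(i >= 0.5 * region_width for i in distances):
            # do some stuff...
            centroids.append(region.centroid[1])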
My question is: is there a way to order the list returned by regionprops by area, and how would I do it? Alternatively, is there a better way to avoid the problem of one region being inside another? Thanks in advance.
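To make the idea concrete, here is a minimal sketch of what I have in mind, not a tested solution. It assumes that regionprops returns an ordinary list whose items expose `area` and `centroid`, so the built-in `sorted` can order them largest first. The helper names `sort_by_area` and `drop_nested` and the `min_distance` parameter are hypothetical (the latter stands in for `0.5 * region_width` in my code), and the `SimpleNamespace` objects are stand-ins for real regions so the snippet runs without an image.

```python
from types import SimpleNamespace


def sort_by_area(regions):
    """Return the regions ordered from largest to smallest area."""
    return sorted(regions, key=lambda r: r.area, reverse=True)


def drop_nested(regions, min_distance):
    """Keep a region only if its column centroid is at least
    min_distance away from every region already kept.

    Regions are visited largest first, so a small fragment can
    no longer block the bigger region that contains it.
    """
    kept = []
    for region in sort_by_area(regions):
        col = region.centroid[1]  # centroid is (row, col)
        if all(abs(col - k.centroid[1]) >= min_distance for k in kept):
            kept.append(region)
    return kept


# Stand-in regions (hypothetical data, not real regionprops output):
regions = [
    SimpleNamespace(area=4, centroid=(5.0, 10.5)),   # fragment of a letter
    SimpleNamespace(area=40, centroid=(5.0, 10.0)),  # the whole letter
    SimpleNamespace(area=35, centroid=(5.0, 30.0)),  # a different letter
]
kept = drop_nested(regions, min_distance=5.0)
print([r.area for r in kept])
```

With scikit-image, I assume the call would be `drop_nested(regionprops(label_image), min_distance)`. Sorting first means the fragment is compared against the whole letter rather than the other way around.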