
I am using a text detection model that returns polygon bounding-box coordinates. I convert the polygons to rectangles so that I can crop the text areas from the image. The resulting bounding boxes come out in a shuffled order and I have not been able to sort them. As far as I understand, the boxes are sorted by Y3. But when a line contains curved text, as in the image below, the order gets shuffled, and I need to sort the boxes before passing them to the text extraction model.

Image with polygon coordinates


I converted the polygons to rectangles for cropping the text areas.

The same image with the converted rectangular bounding boxes


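import cv2
import pandas as pd

img_name = 'rre7'
orig = cv2.imread('CRAFT-pytorch/test/' + img_name + '.jpg')
# each row of the result file holds the four corners of one detected polygon
colnames = ['x1', 'y1', 'x2', 'y2', 'x3', 'y3', 'x4', 'y4']
df = pd.read_csv('result/res_' + img_name + '.txt', header=None,
                 delimiter=',', names=colnames)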
rect = []
boxes = df.values
for i, (x1, y1, x2, y2, x3, y3, x4, y4) in enumerate(boxes):
    # axis-aligned rectangle that encloses the polygon
    startX = min([x1, x2, x3, x4])
    startY = min([y1, y2, y3, y4])
    endX = max([x1, x2, x3, x4])
    endY = max([y1, y2, y3, y4])
    #print([startX,startY,endX,endY])
    rect.append([startX, startY, endX, endY])
# sort the rectangles by their top edge
rect.sort(key=lambda b: b[1])
print("After sorting")
print('\n')
# rect holds [startX, startY, endX, endY], so index 3 is already the bottom edge
# initially the line bottom is set to be the bottom of the first rect
line_bottom = rect[0][3]
line_begin_idx = 0
for i in range(len(rect)):
    # when a new box's top is below the current line's bottom,
    # it starts a new line
    if rect[i][1] > line_bottom:
        # sort the previous line by x
        rect[line_begin_idx:i] = sorted(rect[line_begin_idx:i], key=lambda b: b[0])
        line_begin_idx = i
    # regardless of whether it is a new line or not,
    # always update the line bottom
    line_bottom = max(rect[i][3], line_bottom)
# sort the last line
rect[line_begin_idx:] = sorted(rect[line_begin_idx:], key=lambda b: b[0])
# crop each rectangle in the sorted order and save it
for i, (startX, startY, endX, endY) in enumerate(rect):
    roi = orig[startY:endY, startX:endX]
    cv2.imwrite('gray/' + str(img_name) + '_' + str(i + 1) + '.jpg', roi)

In this case, the polygon bounding-box coordinates and the detected text are:

146,36,354,34,354,82,146,84 "Australian"

273,78,434,151,411,201,250,129 "Collection"

146,97,250,97,250,150,146,150 "vine"

77,166,131,126,154,158,99,197 "Old"

242,215,361,241,354,273,235,248 "Valley"

140,247,224,219,234,250,150,277 "Eden"

194,298,306,296,307,324,194,325 "Shiraz"

232,406,363,402,364,421,233,426 "Vintage"

152,402,216,405,215,425,151,422 "2008"

124,470,209,480,207,500,122,490 "South"

227,481,387,472,389,494,228,503 "Australia"

222,562,312,564,311,585,222,583 "Gibson"

198,564,217,564,217,584,198,584 "by"

386,570,421,570,421,600,386,600 "750 ml"
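
To illustrate why the rectangles of different lines get mixed up, here is a small sketch using four of the polygons listed above. The helper names `enclosing_rect` and `overlaps_vertically` are made up for this illustration. It only shows that the enclosing rectangle of the tilted word "Collection" reaches up into the y-range of "Australian", so any line grouping based on rectangle tops and bottoms will treat them as one line.

```python
# Four polygons copied from the list above (x1,y1,x2,y2,x3,y3,x4,y4).
polys = {
    "Australian": (146, 36, 354, 34, 354, 82, 146, 84),
    "Collection": (273, 78, 434, 151, 411, 201, 250, 129),
    "vine": (146, 97, 250, 97, 250, 150, 146, 150),
    "Old": (77, 166, 131, 126, 154, 158, 99, 197),
}


def enclosing_rect(p):
    """Return (startX, startY, endX, endY) of the axis-aligned box around p."""
    xs, ys = p[0::2], p[1::2]
    return min(xs), min(ys), max(xs), max(ys)


def overlaps_vertically(a, b):
    """True if the y-ranges of two rectangles intersect."""
    return a[1] <= b[3] and b[1] <= a[3]


rects = {name: enclosing_rect(p) for name, p in polys.items()}
for name, r in rects.items():
    print(name, r)

# True: Collection's rectangle starts at y=78, above Australian's bottom at y=84
print(overlaps_vertically(rects["Australian"], rects["Collection"]))
# False: the untilted word "vine" on the same visual line starts at y=97
print(overlaps_vertically(rects["Australian"], rects["vine"]))
```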

The expected output is the coordinates sorted in the order in which the text appears: Australian -> Old -> vine -> Collection -> Eden -> Valley -> Shiraz -> 2008 -> Vintage -> South -> Australia -> by -> Gibson -> 750 ml.
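
One possible way to get this order, given as a rough sketch and not a tested general solution, is to work on the original polygons instead of the enclosing rectangles: take each polygon's centroid, group boxes into lines by centroid y, and sort each line by centroid x. The functions `center_and_height` and `sort_reading_order` and the `ratio` parameter are hypothetical names introduced here, and the default value 0.75 is only an assumption that happens to work for the coordinates in this question. It also assumes the corners are listed top-left, top-right, bottom-right, bottom-left. Strongly tilted or very densely spaced lines would probably need a different threshold or approach.

```python
import math


def center_and_height(p):
    """Centroid and text height of a 4-point polygon (x1,y1,...,x4,y4).

    Corners are assumed to run top-left, top-right, bottom-right,
    bottom-left, as in the coordinates listed in the question.
    """
    x1, y1, x2, y2, x3, y3, x4, y4 = p
    cx = (x1 + x2 + x3 + x4) / 4
    cy = (y1 + y2 + y3 + y4) / 4
    # Height measured along the polygon's own left and right edges, so a
    # tilted word is not inflated the way its enclosing rectangle is.
    left = math.hypot(x4 - x1, y4 - y1)
    right = math.hypot(x3 - x2, y3 - y2)
    return cx, cy, (left + right) / 2


def sort_reading_order(polys, ratio=0.75):
    """Group polygons into lines by centroid y, then sort each line by x.

    ratio is a hypothetical tuning knob: a box starts a new line when its
    centroid lies more than ratio * (smaller of the two text heights)
    below the centroid of the previous box.
    """
    items = sorted(((center_and_height(p), p) for p in polys),
                   key=lambda it: it[0][1])
    lines, current = [], []
    for (cx, cy, h), p in items:
        if current:
            _, prev_cy, prev_h = current[-1][0]
            if cy - prev_cy > ratio * min(h, prev_h):
                lines.append(current)
                current = []
        current.append(((cx, cy, h), p))
    if current:
        lines.append(current)
    ordered = []
    for line in lines:
        ordered.extend(p for _, p in sorted(line, key=lambda it: it[0][0]))
    return ordered


# Detections copied from the question, in the shuffled order they arrive.
detections = [
    ((146, 36, 354, 34, 354, 82, 146, 84), "Australian"),
    ((273, 78, 434, 151, 411, 201, 250, 129), "Collection"),
    ((146, 97, 250, 97, 250, 150, 146, 150), "vine"),
    ((77, 166, 131, 126, 154, 158, 99, 197), "Old"),
    ((242, 215, 361, 241, 354, 273, 235, 248), "Valley"),
    ((140, 247, 224, 219, 234, 250, 150, 277), "Eden"),
    ((194, 298, 306, 296, 307, 324, 194, 325), "Shiraz"),
    ((232, 406, 363, 402, 364, 421, 233, 426), "Vintage"),
    ((152, 402, 216, 405, 215, 425, 151, 422), "2008"),
    ((124, 470, 209, 480, 207, 500, 122, 490), "South"),
    ((227, 481, 387, 472, 389, 494, 228, 503), "Australia"),
    ((222, 562, 312, 564, 311, 585, 222, 583), "Gibson"),
    ((198, 564, 217, 564, 217, 584, 198, 584), "by"),
    ((386, 570, 421, 570, 421, 600, 386, 600), "750 ml"),
]
labels = dict(detections)
order = [labels[p] for p in sort_reading_order([p for p, _ in detections])]
print(" -> ".join(order))
```

The sorted polygons can then be converted to rectangles and cropped in that order, so the crops reach the text extraction model in reading order.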
