I have been trying for a long time to detect oriented bounding boxes (OBB) with Faster R-CNN, without success. My goal is to detect objects in the DOTA dataset. I started with the built-in Faster R-CNN model in PyTorch but realized that it does not support OBB. I then found detectron2, a library built on the PyTorch framework. The built-in Faster R-CNN network in detectron2 is compatible with OBB, but I could not make that model work with DOTA because I could not convert the DOTA box annotations to (cx, cy, w, h, a). In DOTA, objects are annotated by the coordinates of their 4 corners: (x1,y1,x2,y2,x3,y3,x4,y4).

I can't come up with a way to convert these 4 corner coordinates to (cx, cy, w, h, a), where cx and cy are the center point of the OBB and w, h and a are its width, height and angle respectively.

Are there any suggestions?

Arda Tümay

1 Answer

If you have your boxes in an Nx8 tensor/array, you can convert them to (cx, cy, w, h, a) as follows (assuming the first point is top left, the second is bottom left, then bottom right, then top right):

import numpy as np
import torch

def DOTA_2_OBB(boxes):
    # angle of the edge from point 3 (columns 4,5) to point 4 (columns 6,7), via arctan, in degrees
    angle = (torch.atan((boxes[:,7] - boxes[:,5])/(boxes[:,6] - boxes[:,4]))*180/np.pi).float()
    # centre point is the mean of opposite corners
    cx = boxes[:,[4,0]].mean(1)
    cy = boxes[:,[7,3]].mean(1)
    # w and h are the distances between adjacent corners
    w = ((boxes[:,7] - boxes[:,1])**2+(boxes[:,6] - boxes[:,0])**2)**0.5
    h = ((boxes[:,1] - boxes[:,3])**2+(boxes[:,0] - boxes[:,2])**2)**0.5
    # stack into an Nx5 tensor with columns (cx, cy, w, h, angle)
    return torch.stack([cx,cy,w,h,angle]).T
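
For readers unfamiliar with the indexing used in the function, here is a small illustration of how the comma separates the row index from the column index in a 2-D tensor. The variable names `y4`, `xy` and `first` are only illustrative.

```python
import torch

# Two boxes, each stored as (x1, y1, x2, y2, x3, y3, x4, y4)
boxes = torch.tensor([[0, 2, 1, 0, 2, 2, 1, 3],
                      [4, 12, 8, 2, 12, 12, 8, 22]]).float()

# Before the comma: which rows. After the comma: which columns.
y4 = boxes[:, 7]        # every row, column 7        -> shape [2]
xy = boxes[:, [0, 1]]   # every row, columns 0 and 1 -> shape [2, 2]
first = boxes[0, :]     # row 0, every column        -> shape [8]
```

A bare `:` means "take everything along this dimension", so `boxes[:, 7]` reads the same column from all N boxes at once.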

Testing `DOTA_2_OBB` on two sample boxes:

In [40]: boxes = torch.tensor([[0,2,1,0,2,2,1,3],[4,12,8,2,12,12,8,22]]).float()    
    
In [43]: DOTA_2_OBB(boxes)                                                                                            
Out[43]: 
tensor([[  1.0000,   1.5000,   1.4142,   2.2361, -45.0000],
        [  8.0000,  12.0000,  10.7703,  10.7703, -68.1986]])
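
The formulas above presuppose that the four corners really form a rectangle listed in consecutive order. A minimal sketch of a sanity check under that assumption follows; the helper name `is_rectangle` and the tolerance `tol` are illustrative and not part of the answer. It relies on the fact that a quadrilateral is a rectangle when its diagonals share a midpoint and have equal length.

```python
import torch

def is_rectangle(boxes, tol=1e-3):
    """Return a bool tensor of shape [N]: True where the 4 corners form a rectangle.

    Assumes boxes is Nx8 with corners listed in consecutive order
    (clockwise or counter-clockwise), so the diagonals are points 1-3 and 2-4.
    """
    pts = boxes.reshape(-1, 4, 2)
    # Midpoints of the two diagonals
    mid_a = (pts[:, 0] + pts[:, 2]) / 2
    mid_b = (pts[:, 1] + pts[:, 3]) / 2
    # Lengths of the two diagonals
    len_a = (pts[:, 0] - pts[:, 2]).norm(dim=1)
    len_b = (pts[:, 1] - pts[:, 3]).norm(dim=1)
    same_mid = (mid_a - mid_b).abs().amax(dim=1) <= tol
    same_len = (len_a - len_b).abs() <= tol
    return same_mid & same_len
```

Run on the two sample boxes above, this check returns False for both, so they exercise the arithmetic rather than represent true oriented boxes; real DOTA annotations may also deviate slightly from perfect rectangles, in which case a looser `tol` would be needed.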
        
jhso
  • Thank you so much. It works. The only thing I don't understand is the meaning of the comma in the `[:,7]` syntax. What does the comma do when indexing a torch tensor? – Arda Tümay Nov 17 '21 at 07:53
  • Actually, the solution is not that hard, but struggling with a problem for too long can make you overlook simple solutions. A lesson learned for me... – Arda Tümay Nov 17 '21 at 08:05
  • What if the coordinates of the rectangle do not always start from the top-left corner but from a random corner, and I always want the positive angle of the rectangle w.r.t. the x axis? How can I pick the proper coordinates to calculate the angle? – Arda Tümay Nov 17 '21 at 17:52
  • Since the boxes are of shape `[N,8]`, indexing the array with `[:,[1,2]]` gives you the second and third columns for all `N` boxes, returning an array/tensor of shape `[N,2]`. If the box doesn't start at the top left, you just have to adjust the box indices. I assumed top left xy at columns (0,1), bottom left at (2,3), bottom right at (4,5), top right at (6,7). Just switch these around to match your box layout. – jhso Nov 17 '21 at 22:17
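
For the case raised in the comments, where the starting corner is not known, one possible approach is to avoid relying on corner names altogether. The sketch below is not the answerer's method: it assumes only that the corners are listed in consecutive order (either direction), takes the centre as the mean of all four corners, treats the longer edge as the width, and folds the angle of that edge into [0, 180) with `atan2`. The function name `corners_to_obb` is hypothetical.

```python
import torch

def corners_to_obb(boxes):
    """Convert Nx8 corner boxes to Nx5 (cx, cy, w, h, angle_deg).

    Assumptions (not from the original answer): corners are in consecutive
    order, the longer side is called w, and the angle is that of the longer
    side relative to the x axis, folded into [0, 180) degrees.
    """
    pts = boxes.reshape(-1, 4, 2)
    center = pts.mean(dim=1)                  # mean of all four corners
    e1 = pts[:, 1] - pts[:, 0]                # first edge
    e2 = pts[:, 2] - pts[:, 1]                # second (adjacent) edge
    len1 = e1.norm(dim=1)
    len2 = e2.norm(dim=1)
    e1_is_long = len1 >= len2
    w = torch.where(e1_is_long, len1, len2)
    h = torch.where(e1_is_long, len2, len1)
    long_edge = torch.where(e1_is_long.unsqueeze(1), e1, e2)
    # atan2 handles vertical edges without dividing by zero; the modulo
    # makes the result independent of the direction the edge is traversed.
    angle = torch.rad2deg(torch.atan2(long_edge[:, 1], long_edge[:, 0])) % 180.0
    return torch.cat([center, w[:, None], h[:, None], angle[:, None]], dim=1)
```

Two caveats: for a perfect square the two edges tie, so the angle can differ by 90 degrees depending on the starting corner, and in image coordinates the y axis points down, so the sign convention should be checked against whatever the target library expects before training.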