
I fine-tuned the PyTorch torchvision model torchvision.models.detection.fasterrcnn_resnet50_fpn(pretrained=True) on my own custom dataset.

I followed this guide, https://pytorch.org/tutorials/intermediate/torchvision_tutorial.html#torchvision-object-detection-finetuning-tutorial, but trained only Faster R-CNN, not Mask R-CNN.

Training finished without errors, and the model returns a dict containing predicted boxes, labels, and scores.

The guide shows how to visualize the masks predicted by the trained model. Is there a similar way to visualize the bounding boxes? I'm having a lot of trouble figuring this out.

Thank you

adam l

2 Answers


The prediction from FasterRCNN is of the form:

>>> predictions = model([input_img_tensor])
[{'boxes': tensor([[419.6865, 170.0683, 536.0842, 493.7452],
          [159.0727, 180.3606, 298.8194, 434.4604],
          [439.7836, 222.6208, 452.0138, 271.8359],
          [444.3562, 224.4628, 456.1511, 265.5336],
          [437.7808, 226.5965, 446.2904, 271.2691]], grad_fn=<StackBackward>),
  'labels': tensor([ 1,  1, 32, 32, 32]),
  'scores': tensor([0.9997, 0.9996, 0.5827, 0.2102, 0.0943], grad_fn=<IndexBackward>)}]

where the predicted boxes are in [x1, y1, x2, y2] format, with x values between 0 and W and y values between 0 and H (the image width and height).

You can use OpenCV's rectangle function to overlay the bounding boxes on the image.
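Note that the output above includes low-scoring detections (0.2102 and 0.0943), which usually just clutter the picture. A minimal sketch of dropping them before drawing, assuming a hypothetical helper `keep_confident` and an arbitrary threshold of 0.5 (neither is part of torchvision), with the example values copied from the output above:

```python
def keep_confident(prediction, score_threshold=0.5):
    """Return (box, label, score) tuples whose score is at least score_threshold.

    `prediction` is one element of the model output, e.g. predictions[0].
    This helper and the default threshold are illustrative assumptions.
    """
    def to_list(values):
        # Tensors expose .tolist(); plain Python sequences are copied as-is.
        return values.tolist() if hasattr(values, "tolist") else list(values)

    boxes = to_list(prediction["boxes"])
    labels = to_list(prediction["labels"])
    scores = to_list(prediction["scores"])
    return [(b, l, s) for b, l, s in zip(boxes, labels, scores)
            if s >= score_threshold]


# Plain-list copy of the example prediction shown above.
example = {
    "boxes": [[419.6865, 170.0683, 536.0842, 493.7452],
              [159.0727, 180.3606, 298.8194, 434.4604],
              [439.7836, 222.6208, 452.0138, 271.8359],
              [444.3562, 224.4628, 456.1511, 265.5336],
              [437.7808, 226.5965, 446.2904, 271.2691]],
    "labels": [1, 1, 32, 32, 32],
    "scores": [0.9997, 0.9996, 0.5827, 0.2102, 0.0943],
}

kept = keep_confident(example)
for box, label, score in kept:
    print(label, round(score, 4), [int(v) for v in box])
```

With real model output you would pass `predictions[0]` directly; the right threshold depends on your dataset.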

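import cv2

# cv2.imread returns a BGR image; no colour-conversion flag belongs here
img = cv2.imread('input_image.png')

for box in predictions[0]['boxes']:
    # each box is [x1, y1, x2, y2]; cv2.rectangle needs integer pixel coordinates
    x1, y1, x2, y2 = map(int, box.tolist())
    # draws in place on img; (255, 0, 0) is blue in OpenCV's BGR order
    cv2.rectangle(img, (x1, y1), (x2, y2), (255, 0, 0), 1)

# on Colab, use cv2_imshow(img) from google.colab.patches instead
cv2.imshow('img', img)
cv2.waitKey(0)  # keep the window open until a key is pressed
cv2.destroyAllWindows()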
kHarshit
  • `NameError: name 'cv2_imshow' is not defined` for cv2 version 4.6.0 please elaborate on how to fix this bug for newer python/cv2 versions. Thank you. – 7shoe Dec 25 '22 at 22:40
  • @7shoe use `cv2.imshow()` instead. `cv2_imshow()` is available only on Colab. – kHarshit Jan 03 '23 at 15:16

You can use FiftyOne to easily make your dataset and add predictions to it. You can then visualize them in an interactive App. This tutorial follows a similar workflow to what you were doing: https://voxel51.com/docs/fiftyone/tutorials/evaluate_detections.html

This would work a lot better than having to visualize each image individually.


Eric Hofesmann
  • We have found FiftyOne a game-changer. It eliminates the need to code visualizations described in https://stackoverflow.com/a/60275224/11262633. Highly recommended. – mherzog Aug 17 '21 at 20:15