
I am new to programming and have recently been trying to learn object detection using YOLOv5. I can detect a custom object, but I am struggling to record the frames in which the objects were detected.

My goal is to compare the frames detected by my model against the frames where the object has already been annotated frame by frame. That is, I annotate the frames containing my object in "X.mp4" using VIA 3.0 (http://www.robots.ox.ac.uk/~vgg/software/via/demo/via_video_annotator.html), and when the same video is run through my model, it should return the frames containing the objects so I can compare the two.

Ideally, I would like my program to return the frame numbers and the times (in minutes and seconds) in the video where objects were detected.

Apologies if my question is unclear or very basic. Any help is appreciated.

Thanks

1 Answer


When you annotated the images from the video, you probably extracted some frames from it. You should name those frames according to their timestamp or frame number in the video, so that they are easier to retrieve later.

When doing inference, you can count the number of frames since the first one (or get the timestamp) and then read the corresponding annotated image.

For instance, if you annotated video.mp4, which has 30 frames (for simplicity's sake), and only annotated every 10th frame, you should end up with these images: im_0.jpg, im_10.jpg, im_20.jpg, along with their corresponding annotations.
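
If you have not extracted the frames yet, here is a minimal sketch of that step with OpenCV, assuming the stride of 10 and the im_<frame>.jpg naming from the example above:

import cv2

cap = cv2.VideoCapture("video.mp4")
frame_count = 0

while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break
    # Save every 10th frame, named after its index in the video
    # (stride and naming scheme taken from the example above)
    if frame_count % 10 == 0:
        cv2.imwrite(f"im_{frame_count}.jpg", frame)
    frame_count += 1

cap.release()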

During inference, you read the video and count frames:

import cv2

cap = cv2.VideoCapture("video.mp4")  # Or whatever you use to read the video
frame_count = 0

while cap.isOpened():
    ret, frame = cap.read()
    if ret:
        # Name of the annotated image that corresponds to this frame,
        # if one exists (here, every 10th frame was annotated)
        frame_name = f"im_{frame_count}.jpg"

        # Increment the frame count
        frame_count += 1
    else:
        break

cap.release()
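
Since the question also asks for the time in minutes and seconds, a minimal sketch of that conversion, assuming the video has a constant frame rate (the frame index 125 below is just a made-up example):

import cv2

cap = cv2.VideoCapture("video.mp4")
fps = cap.get(cv2.CAP_PROP_FPS)  # Frame rate reported by the video file
cap.release()

frame_count = 125  # Hypothetical frame index where a detection occurred
total_secs = frame_count / fps
mins, secs = divmod(int(total_secs), 60)
print(f"Frame {frame_count} is at roughly {mins} min {secs} s")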

Louis Lac