I am using the VisDrone dataset to train MobileNet-YOLOv3. The dataset contains images of different sizes (e.g. 960 x 540, 1920 x 1080), with an annotation file for each image. When I train the YOLO model, it resizes all images to 416 x 416, which causes small objects to be missed during training and testing. I am also concerned that the annotations become wrong after resizing, since the bounding boxes must change along with the image.

So my question is: how can I resize or crop these images and update the corresponding annotations at the same time? I have both .txt and .xml annotation files for each image.

Another option is to crop each image into 2 to 4 new images and create new annotations from the old ones. Cropping 4 images out of one image is straightforward, but is it possible to convert the one original annotation file into 4 new annotation files that match the cropped areas?

Asad Javed

1 Answer

I had the same issue when resizing Pascal VOC datasets. I used this git repo: https://github.com/italojs/resize_dataset_pascalvoc and it worked fine.
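For reference, the arithmetic behind such a resize is small. The following is a minimal sketch, not the code of the repo above. It assumes the standard Pascal VOC XML layout (`size/width`, `size/height`, `object/bndbox`), and the function name `resize_voc_annotation` is hypothetical.

```python
import xml.etree.ElementTree as ET


def resize_voc_annotation(xml_text, new_w, new_h):
    """Return Pascal VOC XML text with all boxes rescaled to new_w x new_h.

    Assumes a plain stretch-resize of the image (no letterbox padding).
    """
    root = ET.fromstring(xml_text)
    size = root.find("size")
    old_w = int(size.find("width").text)
    old_h = int(size.find("height").text)

    # Independent scale factors, because the aspect ratio may change.
    sx, sy = new_w / old_w, new_h / old_h

    size.find("width").text = str(new_w)
    size.find("height").text = str(new_h)

    for box in root.iter("bndbox"):
        for tag, scale in (("xmin", sx), ("xmax", sx), ("ymin", sy), ("ymax", sy)):
            node = box.find(tag)
            node.text = str(round(float(node.text) * scale))

    return ET.tostring(root, encoding="unicode")
```

Note that if the .txt files use the normalized YOLO format (class, x_center, y_center, width, height as fractions of the image size), a plain stretch-resize leaves them unchanged, because the fractions are the same before and after. Only pixel-based annotations such as the XML need rescaling.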

There is also a Python library, https://pypi.org/project/pascal-voc-tools/, which does more than resizing: it supports other image manipulations and updates the annotation files accordingly.
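As for splitting one image into 4 crops with matching annotations, which the question asks about, that is also possible: each box is intersected with the crop region and shifted by the crop's origin. Below is a minimal sketch under stated assumptions: boxes are already loaded as `(label, xmin, ymin, xmax, ymax)` tuples in pixels, the helper names `tile_regions` and `crop_boxes` are hypothetical, and the `min_visible` threshold for dropping mostly cut-off objects is an arbitrary choice to tune.

```python
def tile_regions(width, height, cols=2, rows=2):
    """Split an image of the given size into a grid of crop regions.

    Each region is (x0, y0, x1, y1) in pixels, listed row by row.
    """
    tw, th = width // cols, height // rows
    return [
        (c * tw, r * th, (c + 1) * tw, (r + 1) * th)
        for r in range(rows)
        for c in range(cols)
    ]


def crop_boxes(boxes, crop, min_visible=0.5):
    """Map boxes into the coordinate frame of one crop region.

    Boxes are clipped to the crop. A box is dropped when less than
    `min_visible` of its original area falls inside the crop.
    """
    cx0, cy0, cx1, cy1 = crop
    kept = []
    for label, x0, y0, x1, y1 in boxes:
        # Intersection of the box with the crop region.
        ix0, iy0 = max(x0, cx0), max(y0, cy0)
        ix1, iy1 = min(x1, cx1), min(y1, cy1)
        if ix1 <= ix0 or iy1 <= iy0:
            continue  # no overlap (or a degenerate box)
        visible = (ix1 - ix0) * (iy1 - iy0) / ((x1 - x0) * (y1 - y0))
        if visible < min_visible:
            continue
        # Shift so that the crop's top-left corner becomes (0, 0).
        kept.append((label, ix0 - cx0, iy0 - cy0, ix1 - cx0, iy1 - cy0))
    return kept


if __name__ == "__main__":
    boxes = [("car", 100, 100, 200, 200), ("bus", 950, 600, 1150, 700)]
    for region in tile_regions(1920, 1080):
        print(region, crop_boxes(boxes, region))
```

The same region tuples can be passed to an image library to cut the pixels, for example Pillow's `Image.crop((x0, y0, x1, y1))`, after which each tile and its box list are written out as a new image and annotation file. Overlapping the tiles slightly is a common way to avoid losing objects that straddle a tile border, at the cost of some duplicated objects.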