Extracting bounding boxes and category labels in MS-COCO dataset

Question

I am working with the MS-COCO dataset and want to extract the bounding boxes and labels for the images in the backpack (category ID: 27) and laptop (category ID: 73) categories, and store them in separate text files to train a neural-network-based model later.

I have already extracted the images for these two categories and created empty annotation files in a separate folder, where I want to store the annotations along with their labels. Each line of an annotation file should look like: label x y w h, where w and h are the width and height of the bounding box. I built on the COCO API (coco.py, to be precise) to extract the images and create the empty annotation files.

Following is the main function I wrote on top of coco.py to do so:

if __name__ == "__main__":
    littleCo = COCO('/home/r.bohare/coco_data/annotations/instances_train2014.json')
    #id_laptop = littleCo.getCatIds('laptop')

    # Extract image IDs corresponding to backpack and laptop images.
    bag_img_ids = littleCo.getImgIds(catIds=[27])
    laptop_img_ids = littleCo.getImgIds(catIds=[73])
    #print "IDs of bag images:", bag_img_ids
    #print "IDs of laptop imgs:", laptop_img_ids

    # Extract annotation IDs corresponding to backpack and laptop images.
    bag_ann_ids = littleCo.getAnnIds(catIds=[27])
    laptop_ann_ids = littleCo.getAnnIds(catIds=[73])
    #print "Annotation IDs of bags:", bag_ann_ids
    #print "Annotation IDs of laptops:", laptop_ann_ids

    # Extract image names corresponding to the bag and laptop categories.
    bag_imgs = littleCo.loadImgs(ids=bag_img_ids)
    laptop_imgs = littleCo.loadImgs(ids=laptop_img_ids)
    #print "Bag images:", bag_imgs
    #print "Laptop images:", laptop_imgs

    bag_img_names = [image['file_name'] for image in bag_imgs]
    laptop_img_names = [image['file_name'] for image in laptop_imgs]
    print "Bag Images:", len(bag_img_names), bag_img_names[:5]
    print "Laptop Images:", len(laptop_img_names), laptop_img_names[:5]

    # Extract annotations corresponding to bag and laptop images.
    bag_ann = littleCo.loadAnns(ids=bag_ann_ids)
    laptop_ann = littleCo.loadAnns(ids=laptop_ann_ids)
    bag_bbox = [ann['bbox'] for ann in bag_ann]
    laptop_bbox = [ann['bbox'] for ann in laptop_ann]
    print "Bags' bounding boxes:", len(bag_ann), bag_bbox[:5]
    print "Laptops' bounding boxes:", len(laptop_bbox), laptop_bbox[:5]

    # Save files corresponding to the bag and laptop categories in a directory.
    import shutil
    #path_to_imgs = "/export/work/Data Pool/coco_data/train2014/"
    #path_to_subset_imgs = "/export/work/Data Pool/coco_subset_data/"
    path_to_ann = "/export/work/Data Pool/coco_subset_data/annotations/"
    dirs_list = [("/export/work/Data Pool/coco_data/train2014/", "/export/work/Data Pool/coco_subset_data/")]

    for source_folder, destination_folder in dirs_list:
        for img in bag_img_names:
            shutil.copy(source_folder + img, destination_folder + img)
        print "Bag images copied!"
        for img in laptop_img_names:
            shutil.copy(source_folder + img, destination_folder + img)
        print "Laptop images copied!" 

    # Create empty files for the annotations.
    for f in os.listdir("/export/work/Data Pool/coco_subset_data/images/"):
        if f.endswith('.jpg'):
            open(os.path.join(path_to_ann, f.replace('.jpg', '.txt')), 'w+').close()
    print "Done creating empty annotation files."

I have provided only the main block here, as the rest of the code is the coco.py file from the COCO API.

I debugged the code to find that there are different data structures:

cats, a dictionary which maps category IDs to their supercategories and category names (labels).
imgToAnns, also a dictionary, which maps every image ID to its annotations (segmentation ground truth, bounding-box ground truth, category ID, etc.). As far as I can tell, I need this dictionary to map the image names in the bag_img_names and laptop_img_names lists to their labels and bounding boxes, but I cannot work out how to access it (no method in coco.py returns it directly).
imgs, another dictionary which gives meta information about all images, such as, image name, image url, captured date etc.
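
One way to see how these three dictionaries fit together is to rebuild them from the annotation JSON itself. The sketch below assumes the standard COCO instances layout (top-level "images", "annotations" and "categories" lists, with each "bbox" stored as [x, y, width, height]). The helper names build_index and annotation_lines and the tiny in-memory dataset are hypothetical, chosen only for illustration, and the label written is the category name, which is also an assumption about the desired format.

```python
from collections import defaultdict


def build_index(dataset):
    """Rebuild the cats, imgs and imgToAnns lookups from a COCO-style dict."""
    cats = {c["id"]: c for c in dataset["categories"]}
    imgs = {i["id"]: i for i in dataset["images"]}
    img_to_anns = defaultdict(list)
    for ann in dataset["annotations"]:
        img_to_anns[ann["image_id"]].append(ann)
    return cats, imgs, img_to_anns


def annotation_lines(dataset, wanted_cat_ids):
    """Map each image file name to its 'label x y w h' lines."""
    cats, imgs, img_to_anns = build_index(dataset)
    lines = {}
    for img_id, anns in img_to_anns.items():
        rows = []
        for ann in anns:
            if ann["category_id"] in wanted_cat_ids:
                x, y, w, h = ann["bbox"]  # assumed order: x, y, width, height
                label = cats[ann["category_id"]]["name"]
                rows.append("{} {} {} {} {}".format(label, x, y, w, h))
        if rows:  # skip images with no annotation in the wanted categories
            lines[imgs[img_id]["file_name"]] = rows
    return lines


# Hypothetical toy data in the assumed layout (not taken from the real dataset).
toy = {
    "categories": [
        {"id": 1, "name": "person", "supercategory": "person"},
        {"id": 27, "name": "backpack", "supercategory": "accessory"},
        {"id": 73, "name": "laptop", "supercategory": "electronic"},
    ],
    "images": [
        {"id": 1, "file_name": "img_a.jpg"},
        {"id": 2, "file_name": "img_b.jpg"},
    ],
    "annotations": [
        {"id": 10, "image_id": 1, "category_id": 27, "bbox": [10.0, 20.0, 30.0, 40.0]},
        {"id": 11, "image_id": 1, "category_id": 1, "bbox": [0.0, 0.0, 5.0, 5.0]},
        {"id": 12, "image_id": 2, "category_id": 73, "bbox": [1.5, 2.5, 3.5, 4.5]},
    ],
}

result = annotation_lines(toy, {27, 73})
```

Here result maps each file name to the lines that would go into its annotation file, so the person box in img_a.jpg is filtered out while the backpack box is kept.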

Finally, I know this is an extremely specific question. Feel free to let me know if this belongs to a community other than stackoverflow (stats.stackexchange.com for example), and I will remove it. Also, it might be possible that I missed some vital information. I will provide it if I can think of it, or if someone asks.

I am only a beginner in Python, so please forgive me if I might have missed something obvious.

Any help whatsoever is highly appreciated. Thank You.

score 0 · Answer 1 · answered Apr 12 '19 at 20:50

Two years have passed. coco.py can now do what you were trying to do: it includes, at the end of the file, functions that map the annotations, converted into RLE format, to the images. Take a look at this cocoapi.
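
For the bounding boxes specifically, the existing query methods may already be enough. The following is a minimal sketch, assuming an object that behaves like pycocotools.coco.COCO (its getImgIds, getAnnIds, loadImgs, loadAnns and loadCats methods). The function name write_annotation_files is hypothetical, and writing the category name as the label is an assumption about the format wanted.

```python
import os


def write_annotation_files(coco, cat_ids, out_dir):
    """Write one 'label x y w h' text file per image containing any of cat_ids.

    `coco` is expected to behave like pycocotools.coco.COCO.
    Returns the list of file paths written.
    """
    names = {c["id"]: c["name"] for c in coco.loadCats(cat_ids)}

    # getImgIds with several category IDs returns only images that contain
    # all of them, so query each category separately and take the union.
    img_ids = set()
    for cat_id in cat_ids:
        img_ids.update(coco.getImgIds(catIds=[cat_id]))

    written = []
    for img in coco.loadImgs(sorted(img_ids)):
        # Annotations of this image restricted to the wanted categories.
        ann_ids = coco.getAnnIds(imgIds=[img["id"]], catIds=cat_ids, iscrowd=None)
        stem = os.path.splitext(img["file_name"])[0]
        path = os.path.join(out_dir, stem + ".txt")
        with open(path, "w") as fh:
            for ann in coco.loadAnns(ann_ids):
                x, y, w, h = ann["bbox"]  # COCO stores x, y, width, height
                label = names[ann["category_id"]]
                fh.write("{} {} {} {} {}\n".format(label, x, y, w, h))
        written.append(path)
    return written
```

With the variables from the question, the call would look something like write_annotation_files(littleCo, [27, 73], path_to_ann). Because each file is opened in write mode, the empty annotation files do not need to be created beforehand.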

Pimpwhippa
