I am working with the MS-COCO dataset and I want to extract the bounding boxes and labels for the images in the backpack (category ID: 27) and laptop (category ID: 73) categories, and store them in separate text files so I can train a neural-network-based model later.
I have already extracted the images for these two categories and created empty annotation files in a separate folder, where I want to store the annotations along with their labels. Each line of an annotation file should look something like: label x y w h, where w and h are the width and height of the bounding box. I built on the COCO API (coco.py, to be precise) to extract the images and create the empty annotation files.
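To make the target format concrete, here is a minimal sketch of how one such line could be produced. It assumes the COCO convention that a bbox is stored as [x, y, width, height] in pixels, with (x, y) the top-left corner; the function name format_annotation_line and the two-decimal formatting are my own illustrative choices, not part of the COCO API.

```python
def format_annotation_line(label, bbox):
    """Return one 'label x y w h' line for a COCO-style bbox.

    bbox is assumed to be [x, y, width, height] in pixels,
    with (x, y) being the top-left corner of the box.
    """
    x, y, w, h = bbox
    # Two decimals is an arbitrary choice for this sketch.
    return "{} {:.2f} {:.2f} {:.2f} {:.2f}".format(label, x, y, w, h)


# Hypothetical example values, not taken from the real dataset.
line = format_annotation_line("backpack", [10.5, 20.0, 30.25, 40.0])
# line == "backpack 10.50 20.00 30.25 40.00"
```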
Following is the main function I wrote on top of coco.py to do so:
if __name__ == "__main__":
    # This block is appended to coco.py, so the COCO class is already defined in this file.
    littleCo = COCO('/home/r.bohare/coco_data/annotations/instances_train2014.json')
    #id_laptop = littleCo.getCatIds('laptop')

    """Extracting image ids corresponding to backpack and laptop images."""
    bag_img_ids = littleCo.getImgIds(catIds=[27])
    laptop_img_ids = littleCo.getImgIds(catIds=[73])
    #print "IDs of bag images:", bag_img_ids
    #print "IDs of laptop imgs:", laptop_img_ids

    """Extracting annotation ids corresponding to backpack and laptop images."""
    bag_ann_ids = littleCo.getAnnIds(catIds=[27])
    laptop_ann_ids = littleCo.getAnnIds(catIds=[73])
    #print "Annotation IDs of bags:", bag_ann_ids
    #print "Annotation IDs of laptops:", laptop_ann_ids

    """Extracting image names corresponding to bag and laptop categories."""
    bag_imgs = littleCo.loadImgs(ids=bag_img_ids)
    laptop_imgs = littleCo.loadImgs(ids=laptop_img_ids)
    #print "Bag images:", bag_imgs
    #print "Laptop images:", laptop_imgs
    bag_img_names = [image['file_name'] for image in bag_imgs]
    laptop_img_names = [image['file_name'] for image in laptop_imgs]
    print "Bag Images:", len(bag_img_names), bag_img_names[:5]
    print "Laptop Images:", len(laptop_img_names), laptop_img_names[:5]

    """Extracting annotations corresponding to bag and laptop images."""
    bag_ann = littleCo.loadAnns(ids=bag_ann_ids)
    laptop_ann = littleCo.loadAnns(ids=laptop_ann_ids)
    bag_bbox = [ann['bbox'] for ann in bag_ann]
    laptop_bbox = [ann['bbox'] for ann in laptop_ann]
    print "Bags' bounding boxes:", len(bag_ann), bag_bbox[:5]
    print "Laptops' bounding boxes:", len(laptop_bbox), laptop_bbox[:5]

    """Saving files corresponding to bags and laptop category in a directory."""
    import os  # needed for os.listdir / os.path.join below
    import shutil
    #path_to_imgs = "/export/work/Data Pool/coco_data/train2014/"
    #path_to_subset_imgs = "/export/work/Data Pool/coco_subset_data/"
    path_to_ann = "/export/work/Data Pool/coco_subset_data/annotations/"
    dirs_list = [("/export/work/Data Pool/coco_data/train2014/", "/export/work/Data Pool/coco_subset_data/")]
    for source_folder, destination_folder in dirs_list:
        for img in bag_img_names:
            shutil.copy(source_folder + img, destination_folder + img)
        print "Bag images copied!"
        for img in laptop_img_names:
            shutil.copy(source_folder + img, destination_folder + img)
        print "Laptop images copied!"

    """Creating empty files for annotation."""
    for f in os.listdir("/export/work/Data Pool/coco_subset_data/images/"):
        if f.endswith('.jpg'):
            open(os.path.join(path_to_ann, f.replace('.jpg', '.txt')), 'w+').close()
    print "Done creating empty annotation files."
I have provided only the main function here, as the rest of the code is the coco.py file from the COCO API.
While debugging the code, I found that there are several data structures:
cats, a dictionary which maps category IDs to their supercategories and category names (labels).
imgToAnns, also a dictionary, which maps every image ID to its segmentation ground truth, bounding box ground truth, category ID, etc. From what I have gathered so far, I think I need to use this dictionary to somehow map the image names I have in the bag_img_names and laptop_img_names lists to their labels and bounding boxes, but I cannot work out how to access this dictionary (no method in coco.py returns it directly).
imgs, another dictionary, which gives meta information about all images, such as image name, image URL, capture date, etc.
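For illustration, the mapping I am after could look roughly like the sketch below. It does not load the real dataset: the dataset dictionary is a tiny stand-in with hypothetical values, laid out like the images / categories / annotations sections of the instances JSON file, and build_indexes rebuilds dictionaries shaped like cats, imgs and imgToAnns from it. The helper names, the file names and the temporary output directory are assumptions made only for this example.

```python
import os
import tempfile
from collections import defaultdict

# Tiny stand-in for instances_train2014.json (hypothetical values, same layout).
dataset = {
    "images": [
        {"id": 1, "file_name": "img_1.jpg"},
        {"id": 2, "file_name": "img_2.jpg"},
    ],
    "categories": [
        {"id": 1, "name": "person", "supercategory": "person"},
        {"id": 27, "name": "backpack", "supercategory": "accessory"},
        {"id": 73, "name": "laptop", "supercategory": "electronic"},
    ],
    "annotations": [
        {"id": 10, "image_id": 1, "category_id": 27, "bbox": [10.0, 20.0, 30.0, 40.0]},
        {"id": 11, "image_id": 1, "category_id": 1, "bbox": [0.0, 0.0, 5.0, 5.0]},
        {"id": 12, "image_id": 2, "category_id": 73, "bbox": [1.5, 2.5, 3.5, 4.5]},
    ],
}

WANTED_CAT_IDS = {27, 73}  # backpack and laptop


def build_indexes(data):
    """Rebuild dictionaries shaped like cats, imgs and imgToAnns."""
    cats = {cat["id"]: cat for cat in data["categories"]}
    imgs = {img["id"]: img for img in data["images"]}
    img_to_anns = defaultdict(list)
    for ann in data["annotations"]:
        img_to_anns[ann["image_id"]].append(ann)
    return cats, imgs, img_to_anns


def write_annotation_files(data, out_dir, wanted=WANTED_CAT_IDS):
    """Write one 'label x y w h' line per wanted annotation, one file per image."""
    cats, imgs, img_to_anns = build_indexes(data)
    written = {}
    for img_id, info in imgs.items():
        lines = []
        for ann in img_to_anns.get(img_id, []):
            if ann["category_id"] not in wanted:
                continue  # skip categories other than backpack / laptop
            x, y, w, h = ann["bbox"]
            label = cats[ann["category_id"]]["name"]
            lines.append("%s %s %s %s %s" % (label, x, y, w, h))
        if not lines:
            continue  # image has no backpack or laptop annotation
        txt_name = os.path.splitext(info["file_name"])[0] + ".txt"
        with open(os.path.join(out_dir, txt_name), "w") as handle:
            handle.write("\n".join(lines) + "\n")
        written[txt_name] = lines
    return written


out_dir = tempfile.mkdtemp()  # stand-in for the real annotations folder
written = write_annotation_files(dataset, out_dir)
```

With the real data, the same loop would presumably read from the dictionaries I saw in the debugger (for example as attributes of littleCo) instead of rebuilding them, but I have not verified that this is the intended way to use them.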
Finally, I know this is an extremely specific question. Feel free to let me know if it belongs to a community other than Stack Overflow (stats.stackexchange.com, for example), and I will remove it. It is also possible that I have left out some vital information. I will add it if I think of it, or if someone asks.
I am only a beginner in Python, so please forgive me if I have missed something obvious.
Any help whatsoever is highly appreciated. Thank you.