Because I couldn't find the answer elsewhere, I decided to describe my issue here. I'm trying to create a keypoint detector for the Eachine TrashCan drone in order to estimate its pose. I followed some tutorials. The first one used the TensorFlow Object Detection API, and because I couldn't find a solution with it, I tried detectron2. Everything was fine until I needed to register my own dataset to retrain a model.

I am running the code on Google Colab and using coco-annotator (https://github.com/jsbroks/coco-annotator/) to make the annotations.

I don't think I annotated my dataset incorrectly, but who knows, so the hyperlink below shows it: Picture with annotations made by me

I used this code to register the data:

from detectron2.data.datasets import register_coco_instances
register_coco_instances("TrashCan_train", {}, "./TrashCan_train/mask_train.json", "./TrashCan_train")
register_coco_instances("TrashCan_test", {}, "./TrashCan_test/mask_test.json", "./TrashCan_test")

This part doesn't give me an error, but when I try to start the training with this code:

import os
from detectron2 import model_zoo
from detectron2.config import get_cfg
from detectron2.engine import DefaultTrainer

cfg = get_cfg()
cfg.merge_from_file(model_zoo.get_config_file("COCO-Keypoints/keypoint_rcnn_R_50_FPN_3x.yaml"))
cfg.DATASETS.TRAIN = ("TrashCan_train",)
cfg.DATASETS.TEST = ("TrashCan_test",)  # note the trailing comma: this must be a tuple
cfg.DATALOADER.NUM_WORKERS = 2
cfg.MODEL.WEIGHTS = "detectron2://COCO-Keypoints/keypoint_rcnn_R_50_FPN_3x/137849621/model_final_a6e10b.pkl"  # let training initialize from model zoo
cfg.SOLVER.IMS_PER_BATCH = 2
cfg.SOLVER.BASE_LR = 0.00025  # pick a good LR
cfg.SOLVER.MAX_ITER = 25  # enough for a quick test on this toy dataset; train longer for a practical dataset
cfg.MODEL.ROI_HEADS.BATCH_SIZE_PER_IMAGE = 128  # faster, and good enough for this toy dataset (default: 512)
cfg.MODEL.ROI_HEADS.NUM_CLASSES = 1  # number of classes

os.makedirs(cfg.OUTPUT_DIR, exist_ok=True)
trainer = DefaultTrainer(cfg)
trainer.resume_or_load(resume=False)
trainer.train()

I end up with this:

WARNING [07/14 14:36:52 d2.data.datasets.coco]: 
Category ids in annotations are not in [1, #categories]! We'll apply a mapping for you.

[07/14 14:36:52 d2.data.datasets.coco]: Loaded 5 images in COCO format from ./mask_train/mask_train.json

---------------------------------------------------------------------------

KeyError                                  Traceback (most recent call last)

<ipython-input-12-f4f5153c62a1> in <module>()
     14 
     15 os.makedirs(cfg.OUTPUT_DIR, exist_ok=True)
---> 16 trainer = DefaultTrainer(cfg)
     17 trainer.resume_or_load(resume=False)
     18 trainer.train()

7 frames

/usr/local/lib/python3.6/dist-packages/detectron2/data/datasets/coco.py in load_coco_json(json_file, image_root, dataset_name, extra_annotation_keys)
    183             obj["bbox_mode"] = BoxMode.XYWH_ABS
    184             if id_map:
--> 185                 obj["category_id"] = id_map[obj["category_id"]]
    186             objs.append(obj)
    187         record["annotations"] = objs

KeyError: 9

This is where you can download my files:

https://github.com/BrunoKryszkiewicz/EachineTrashcan-keypoints/blob/master/TrashCan_Datasets.zip

and where will you find my Colab notebook:

https://colab.research.google.com/drive/1AlLZxrR64irms9mm-RiZ4DNOs2CaAEZ9?usp=sharing

I created such a small dataset because, first of all, I wanted to get through the training process. If I manage to do that, I will make my dataset bigger.

Now I'm not even sure whether it is possible to retrain a human keypoint model to obtain keypoints of a drone-like object, or of any other object. I should say that I'm pretty new to the topic, so I ask for your understanding. If you know any tutorials for creating a custom (non-human) keypoint detector, I will be grateful for any information.

Best regards!

BrunoK

1 Answer

Welcome to Stack Overflow!

Right off the bat I can see some differences between your JSON file and the COCO format described on the official COCO website. Some keys, such as "keypoint colors", are superfluous. However, in my experience detectron2 ignores superfluous keys, so they do not pose a problem.

The actual error message you are getting is caused by annotations whose category_id detectron2's mapping does not account for.

There is only one category in the "categories" part of your mask_test.json file, but you have two annotations: one with category_id = 9 and another with category_id = 11.

What detectron2 does is count the categories in the "categories" field of the JSON and, if they aren't numbered 1 through n, generate its own mapping. In your case it transforms 11 (the id that is present) into 1 (in both the annotations and categories fields), but it has no idea what to do with the annotation that has category 9. That is what causes the error on line 185, obj["category_id"] = id_map[obj["category_id"]], because there simply is no mapping from 9 to anything.

One thing to note is that, as far as I can tell (it would be worth asking on the detectron2 GitHub page), you cannot train keypoint detection with multiple "classes" of keypoints using detectron2.

Anyway, the solution to your problem is fairly simple: keep a single category with id: 1 in the categories field, change the category_id of every entry in the annotations field to 1, and rerun your training.
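That clean-up can be scripted. The sketch below is a hypothetical one-off fix: it assumes your annotation files live at the paths you used when registering the datasets, keeps only the first entry of "categories", forces its id to 1, and points every annotation at it.

```python
import json

def remap_to_single_category(json_path):
    """Rewrite a COCO json so it has exactly one category with id 1."""
    with open(json_path) as f:
        coco = json.load(f)

    # Keep exactly one category and force its id to 1.
    coco["categories"] = coco["categories"][:1]
    coco["categories"][0]["id"] = 1

    # Point every annotation at that single category.
    for ann in coco["annotations"]:
        ann["category_id"] = 1

    with open(json_path, "w") as f:
        json.dump(coco, f)

# Assumed paths, matching the register_coco_instances calls above:
# remap_to_single_category("./TrashCan_train/mask_train.json")
# remap_to_single_category("./TrashCan_test/mask_test.json")
```

Run it once on both the train and test json files before registering the datasets.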

EDIT: you will also need to change a few other settings in the cfg for this to work, namely TEST.KEYPOINT_OKS_SIGMAS (the sigmas used for evaluation, one per keypoint) and MODEL.ROI_KEYPOINT_HEAD.NUM_KEYPOINTS (the number of keypoints in your category). What is more, you need keypoint_flip_map, keypoint_names and keypoint_connection_rules in the metadata of your dataset. To set these you can use MetadataCatalog.get('name_of_your_set').set(...) with the metadata described above.
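Putting those pieces together, a minimal sketch might look like this. The keypoint names, flip pairs, connection rules and sigma values are placeholders for an assumed 5-keypoint drone template; replace them with whatever you actually annotated.

```python
from detectron2 import model_zoo
from detectron2.config import get_cfg
from detectron2.data import MetadataCatalog

NUM_KEYPOINTS = 5  # assumption: adjust to your annotation template

cfg = get_cfg()
cfg.merge_from_file(model_zoo.get_config_file("COCO-Keypoints/keypoint_rcnn_R_50_FPN_3x.yaml"))
cfg.MODEL.ROI_KEYPOINT_HEAD.NUM_KEYPOINTS = NUM_KEYPOINTS
cfg.TEST.KEYPOINT_OKS_SIGMAS = [0.05] * NUM_KEYPOINTS  # one sigma per keypoint; tune per point

# Placeholder metadata for an imagined drone skeleton:
MetadataCatalog.get("TrashCan_train").set(
    keypoint_names=["nose", "tail", "left_motor", "right_motor", "camera"],
    keypoint_flip_map=[("left_motor", "right_motor")],  # pairs swapped on horizontal flip
    keypoint_connection_rules=[("nose", "camera", (0, 255, 0))],  # (kp1, kp2, RGB color)
)
```

Set the same metadata on "TrashCan_test" as well, so evaluation and visualization use it too.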

Vahagn Tumanyan
  • How could I find config for keypoint detection in detectron2? – Seunghyeon Dec 07 '20 at 13:17
  • It's in the configs folder in the Git repo. – Vahagn Tumanyan Dec 07 '20 at 13:20
  • @Vahagn Tumanyan can you please explain how to calculate TEST.KEYPOINT_OKS_SIGMAS (list of floats)? It's unclear to me. Thank you – JammingThebBits Jun 24 '21 at 21:06
  • 1
    @JammingTheBits The official MSCOCO dataset gives defaults for human-skeleton keypoints. These numbers are basically how close (in what epsilon radius) should each point of the template be to the ground truth position so we consider it a "Correct" point. They have calculated they're own through experimentation and trial and error. Basically an eye can't be too far away, but a point on the knee can have more of a leeway. There's no specific way to calculate them, you just have to adjust to your own problem. – Vahagn Tumanyan Jul 01 '21 at 12:48