
How can I see the inputs to my Detectron2 Faster R-CNN object detection model? I need to debug the model by inspecting both the X (image) and Y (bounding-box coordinates, classes, labels, etc.). I am using some custom augmentations, and this GitHub comment says:

You can loop over the data loader with for data in data_loader and visualize them

But how do I do that? I can easily loop through my custom augmentations, but what I want to know is how to see the inputs just before they reach the model's first layer.

I am using custom augmentations with mapper like:

import copy
import cv2
import numpy as np
import torch
from detectron2.data import detection_utils as utils
from detectron2.data import transforms as T
from detectron2.data import build_detection_train_loader
from detectron2.engine import DefaultTrainer
from detectron2.utils.visualizer import Visualizer


def custom_mapper(dataset_dict, transform_list=None):
    if transform_list is None:
        transform_list = [
            T.RandomBrightness(0.8, 1.2),
            T.RandomContrast(0.8, 1.2),
            T.RandomSaturation(0.8, 1.2),
        ]

    dataset_dict = copy.deepcopy(dataset_dict)
    image = utils.read_image(dataset_dict["file_name"], format="BGR")

    image, transforms = T.apply_transform_gens(transform_list, image)
    tensor_image = torch.as_tensor(image.transpose(2, 0, 1).astype("float32"))
    dataset_dict["image"] = tensor_image

    annos = [
        utils.transform_instance_annotations(obj, transforms, image.shape[:2])
        for obj in dataset_dict.pop("annotations")
        if obj.get("iscrowd", 0) == 0
    ]
    instances = utils.annotations_to_instances(annos, image.shape[:2])
    dataset_dict["instances"] = utils.filter_empty_instances(instances)


    # visualizer = Visualizer(dataset_dict["image"].numpy().transpose(1,2,0).astype(np.uint8)[:, :, ::-1], scale=0.5)
    # out = visualizer.draw_dataset_dict(dataset_dict)
    # cv2.imwrite(str(np.random.rand())+".jpg", out.get_image()[:, :, ::-1], )

    return dataset_dict


class AugTrainer(DefaultTrainer): # Trainer with augmentations
    @classmethod
    def build_train_loader(cls, cfg):
        return build_detection_train_loader(cfg, mapper=custom_mapper)

I have tried the three commented-out lines (just before return dataset_dict) to save the images so I can see what they look like. But how can I verify that the bounding boxes and classes are correct, and, more importantly, that these are the exact images that go into the model and nothing else is altering my input?

Deshwal

1 Answer


I haven't tested this, but one way is to override the SimpleTrainer class. In detectron2.engine.train_loop.SimpleTrainer there is a run_step function where the data goes into the model, around lines 307-311:

loss_dict = self.model(data)

So we can sample the data at this point and save it during training based on some sampling logic.

But for this you either have to clone and modify the Detectron2 code on your system, or override the method in a subclass. I plan to test the second approach.

Deshwal