I am working on a sign-language-to-text converter, and I am currently referring to the following repo for my project:
https://github.com/nicknochnack/RealTimeObjectDetection
To train my SSD model to detect the signs in real time, I need to create XML annotation files for my images. For this I used the labelImg tool
(https://github.com/heartexlabs/labelImg), which generates an XML file from the bounding box you draw on each image. But since this is a manual process, labeling thousands of images across different classes is very tedious and impractical.
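For reference, labelImg writes annotations in the Pascal VOC XML format, one file per image, roughly like the following (the field names are the standard VOC ones; the filename, label, and coordinate values here are only illustrative):

```xml
<annotation>
  <filename>hello_001.jpg</filename>
  <size>
    <width>640</width>
    <height>480</height>
    <depth>3</depth>
  </size>
  <object>
    <name>hello</name>
    <bndbox>
      <xmin>100</xmin>
      <ymin>80</ymin>
      <xmax>300</xmax>
      <ymax>280</ymax>
    </bndbox>
  </object>
</annotation>
```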
While doing research, I found a blog post with an accompanying repo that automates this process. Here is the link:
https://github.com/AlvaroCavalcante/auto_annotate
The repo provides a script that automatically generates XML files for the images in a folder. After running it on my image folder, I came across the following issue.
Here is what happens in my case: for two almost identical images, the script draws bounding boxes with different dimensions. It detects the label in image 1 with the wrong box dimensions; what it should do is detect the label as in image 2 and draw a box with those dimensions for every image in the folder, but it does not. So I want to know whether I can define the box dimensions beforehand.
So my question is: is there any way to set predefined bounding-box dimensions for my images?
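Since labelImg's output is plain Pascal VOC XML, one workaround I am considering is generating the files myself with fixed box coordinates, bypassing the detector entirely. A minimal sketch (the function name and arguments are my own; the XML fields mirror what labelImg writes):

```python
import xml.etree.ElementTree as ET

def write_voc_xml(image_name, width, height, label, box, out_path):
    """Write a minimal Pascal VOC annotation with a fixed bounding box.

    box is (xmin, ymin, xmax, ymax) in pixels -- the same fields
    labelImg produces, so the file can be consumed by the usual
    TFRecord conversion scripts for SSD training.
    """
    ann = ET.Element("annotation")
    ET.SubElement(ann, "filename").text = image_name
    size = ET.SubElement(ann, "size")
    ET.SubElement(size, "width").text = str(width)
    ET.SubElement(size, "height").text = str(height)
    ET.SubElement(size, "depth").text = "3"
    obj = ET.SubElement(ann, "object")
    ET.SubElement(obj, "name").text = label
    bnd = ET.SubElement(obj, "bndbox")
    for tag, val in zip(("xmin", "ymin", "xmax", "ymax"), box):
        ET.SubElement(bnd, tag).text = str(val)
    ET.ElementTree(ann).write(out_path)

# Example: write the same fixed 200x200 box for one image
# write_voc_xml("hello_001.jpg", 640, 480, "hello",
#               (100, 80, 300, 280), "hello_001.xml")
```

Of course this only makes sense if the signs really do sit in the same region of every frame, which is what I observe in my capture setup.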
If you check the repo, you will find the same issue there as well. It would be great if anyone could help me with this question.
Also, which model can I actually use for such custom images? Out of 1,000 images in the folder, the script detects only around 12-15 of them with labels and creates XML files for those. Currently I am using the ssd_mobilenet_320x320_coco2017 model file for this.
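From what I understand, auto-annotation scripts like this keep only detections whose confidence exceeds a threshold, so lowering that threshold is the usual knob for getting more images annotated (at the cost of noisier boxes). A minimal sketch of that filtering step (the function name is mine; the array shapes follow the TF Object Detection API convention of boxes `(N, 4)` and scores `(N,)`):

```python
import numpy as np

def keep_confident(boxes, scores, threshold=0.5):
    """Keep only detections whose score passes the threshold.

    boxes:  (N, 4) array of [ymin, xmin, ymax, xmax]
    scores: (N,) array of detection confidences
    Lowering `threshold` lets more (but noisier) detections
    through, so more images get an XML file.
    """
    mask = scores >= threshold
    return boxes[mask], scores[mask]

# Example: with threshold=0.3 the second detection survives;
# with the default 0.5 it would be dropped.
boxes = np.array([[0.1, 0.1, 0.5, 0.5],
                  [0.2, 0.2, 0.6, 0.6]])
scores = np.array([0.9, 0.35])
kept_boxes, kept_scores = keep_confident(boxes, scores, threshold=0.3)
```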