Please excuse me for not posting any code, as I don't think I have reached far enough to be relevant for my question.

I am working on a solution that needs to identify the part of a vehicle that a customer's drawing points to, and to extract both the label text and the part it refers to, as shown in the example below.

[example image]

I am really new to ML and AI technologies, so I have been looking at Azure Custom Vision (customvision.ai), which lets me train an object detection model on a set of images and has a nice REST API to work with. This is somewhat working: I can pass in an image and it identifies the car parts visible in it.

However, I can't work out how to identify that 9. BXCU12 is actually pointing to the Bonnet.

Can someone please point me to an example or a suitable approach for solving this problem?

Kiran
  • Could you please elaborate on "However I am unable to understand how to how to identify that 9. BXCU12 is actually pointing to Bonnet."? – Ash May 19 '20 at 17:50
  • @Ash: There will be some arbitrary text on the image with an arrow pointing to a part of the car. I need to read the text and find the part the arrow points to, as you can see in the image above. After training an object detection model I can identify the text on the image as well as the parts of the car, but I can't see how to link the text to the part based on the arrow. – Kiran May 20 '20 at 05:40
  • Does anyone have any suggestion on how to proceed here? – Kiran May 29 '20 at 18:53

1 Answer

If I understand correctly, you can already identify both the parts and the text with your recognition network, and the link between them is given by the arrows in the image, which you don't know how to locate. So the remaining problem here is detecting the arrows and their end-points.
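To make that concrete, here is a minimal sketch of the linking step, assuming you already have each arrow as a (tail, head) pair of pixel coordinates and each text label and part as an axis-aligned bounding box (x0, y0, x1, y1). The function names and data layout are made up for illustration; they are not part of Custom Vision or any other library.

```python
from math import hypot


def point_to_box_distance(point, box):
    """Distance from a point to an axis-aligned box; 0 if the point is inside."""
    px, py = point
    x0, y0, x1, y1 = box
    dx = max(x0 - px, 0, px - x1)
    dy = max(y0 - py, 0, py - y1)
    return hypot(dx, dy)


def link_labels_to_parts(arrows, labels, parts):
    """Map each label to the part its arrow points at.

    arrows: list of (tail, head) points, e.g. ((100, 50), (200, 120))
    labels: dict of label text -> box (hypothetical layout)
    parts:  dict of part name  -> box (hypothetical layout)
    """
    links = {}
    for tail, head in arrows:
        # The label is the text box closest to the arrow's tail.
        label = min(labels, key=lambda t: point_to_box_distance(tail, labels[t]))
        # The part is the detected box closest to (or containing) the head.
        part = min(parts, key=lambda n: point_to_box_distance(head, parts[n]))
        links[label] = part
    return links
```

If part boxes overlap, the arrow head may fall inside more than one of them, so you would probably need a tie-break such as preferring the smallest box. Either way, this step is simple geometry; the hard part is getting the arrows in the first place.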

I can think of two solutions right now:

1) Use template matching to identify your arrows. The problem in your case (judging from your example image) seems to be that the arrow heads all have the same scale but the arrows have different lengths. So I'd suggest using just the head of the arrow plus a very short tail as your template. Then you can rotate this small template N times to obtain N templates and run something like OpenCV's template matching with each of them.

2) Train a small convolutional neural network to recognize the arrows. You only want to recognize arrows, so it's rather easy to create a small dataset of rotated arrows at different scales and train the network on them. Note that you should probably be able to add this network as an additional, very shallow head on your recognition network (you'll need to fine-tune the two jointly though), so the overhead would be minimal.

Hope that helps.

Ash