I'm currently working on my first assignment in image processing (using OpenCV in Python). My assignment is to calculate a precise score (to tenths of a point) of one to several shooting holes in an image uploaded by a user. One of the requirements is to transform the uploaded shooting target image to be from "birds-eye view" for further processing. For that I have decided that I need to find center coordinates of numbers (7 & 8) to select them as my 4 quadrilateral.
Unfortunately, there are several limitations that need to be taken into account.
Limitations:
- resolution of the processed shooting target image can vary
- the image can be taken in different lighting conditions
- the image processed by this part of my algorithm will always be taken under an angle (extreme angles will be automatically rejected)
- the image can be slightly rotated (+/- 10 degrees)
- the shooting target can be just a part of the image
- the image can be only of the center black part of the target, meaning the user doesn't have to take a photo of the whole shooting target (but there always has to be the center black part on it)
- this algorithm can take a maximum of 2000ms runtime
What I have tried so far:
- Template matching
- here I quickly realized that it was unusable since the numbers could be slightly rotated and a different scale
- Feature matching
- I have tried all of the different feature matching types (SIFT, SURF, ORB...)
- unfortunately, the numbers do not have that specific set of features so they matched a quite lot of false positives, but I could possibly filter them by adding shape matching, etc..
- the biggest blocker was runtime, the runtime of only a single number feature matching took around 5000ms (even after optimizations) (on MacBook PRO 2017)
- Optical character recognition
- I mostly tried using pytesseract library
- even after thresholding the image to inverted binary (so the text of numbers 7 and 8 is black and the background white) it failed to recognize them
- I also tried several ways of preprocessing the image and I played a lot with the tesseract config parameter but it didn't seem to help whatsoever
- Contour detection
- I have easily detected all of the wanted numbers (7 & 8) as single contours but failed to filter out all of the false positives (since the image can be in different resolutions and also there are two types of targets with different sizes of the numbers I couldn't simply threshold the contour by its width, height or area)
- After I would detect the numbers as contours I wanted to extract them as some ROI and then I would use OCR on them (but since there were so many false positives this would take a lot of time)
- I also tried filtering them by using cv2.matchShapes function on both contours and cropped template / ROI but it seemed really unreliable
Example processed images:
As of right now, I'm lost on how to progress about this. I have tried everything I could think of. I would be immensely happy if any of you image recognition experts gave me any kind of advice or even better a usable code example to help me solve my problem.
Thank you all in advance.