
I'm trying to identify regions of various UI elements in a video game. I am trying to use OpenCV template matching to accomplish this. These UI elements often contain varying graphics/text/icons, so it's not as easy as matching a generic template. Therefore, my plan was to make templates with transparency - where I make non-static parts of the UI transparent so that they are ignored during the template matching. My script is working for most instances, except when there are many black pixels on screen.

I've already considered both of these questions: *How do I find an image on screen ignoring transparent pixels* and *How to template match a simple 2D shape in OpenCV?*

Here is a sample screenshot of the game view.

base image

Here are some of the transparent templates I'm using to identify regions of UI elements:

minimap template inventory template chatbox template

All of these templates work with the following script - EXCEPT when there is an excessive number of black pixels on screen:

''' mostly based on: https://stackoverflow.com/questions/71302061/how-do-i-find-an-image-on-screen-ignoring-transparent-pixels/71302306#71302306'''

import cv2
import math

# read game image
img = cv2.imread('base.png')

# read image template
template = cv2.imread('minimap.png', cv2.IMREAD_UNCHANGED)
hh, ww = template.shape[:2]

# extract base image and alpha channel and make alpha 3 channels
base = template[:,:,0:3]
alpha = template[:,:,3]
alpha = cv2.merge([alpha,alpha,alpha])

# do masked template matching and save correlation image
correlation = cv2.matchTemplate(img, base, cv2.TM_CCORR_NORMED, mask=alpha)

# set the match threshold
threshold = 0.90

''' from:  https://stackoverflow.com/questions/61779288/how-to-template-match-a-simple-2d-shape-in-opencv/61780200#61780200 '''
# search for the best match location
result = img.copy()

# find max value of correlation image
min_val, max_val, min_loc, max_loc = cv2.minMaxLoc(correlation)
print(max_val, max_loc)

if max_val > threshold:
    # draw match on copy of input
    cv2.rectangle(result, max_loc, (max_loc[0]+ww, max_loc[1]+hh), (0,0,255), 1)

    # save results
    cv2.imwrite('match.jpg', result)
    cv2.imshow('result', result)
    cv2.waitKey(0)
else:
    print("No match found")

Here is the result of searching for the minimap when black pixels are present:

erroneous result

Here is the result when there are fewer black pixels on screen:

correct result

Admittedly, I don't quite understand how the template matching algorithm is treating the alpha layer, so I'm not sure why the black pixels are interfering with the search. Can anyone explain?
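For context on what the "NORMED" in `TM_CCORR_NORMED` means: the score at each location is divided by the product of the L2 norms of the template and of the image window under it, so a window of all-zero (black) pixels drives the denominator to zero. A tiny NumPy sketch of that formula, with made-up numbers rather than the actual game data (the masked variant OpenCV uses differs in detail, but the zero-norm problem is the same):

```python
import numpy as np

def ccorr_normed(template, window):
    """Normalized cross-correlation of one window, as in TM_CCORR_NORMED."""
    num = np.sum(template * window)
    den = np.sqrt(np.sum(template**2) * np.sum(window**2))
    return num / den  # undefined (0/0) when the window is all zeros

t = np.array([[10., 20.], [30., 40.]])       # hypothetical template patch
textured = np.array([[12., 18.], [33., 39.]])  # a window with real content
flat_black = np.zeros((2, 2))                # a window of pure black pixels

print(ccorr_normed(t, textured))             # well-defined, close to 1

# For the flat black window, the denominator is exactly zero:
den = np.sqrt(np.sum(t**2) * np.sum(flat_black**2))
print(den)                                   # 0.0 -> the division blows up
```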

ktom
  • I'm noticing the minimap is mostly black when shown over landscape, but coloured in when shown over black? – Grismar Nov 27 '22 at 22:25
  • The templates you present *surely* are always in the same places? Which of those are not? – Christoph Rackwitz Nov 27 '22 at 23:16
  • TM_CCORR_NORMED does not work well in flat areas of colour (black), since the denominator in the normed version is the standard deviation of the region. This will be 0. So a divide by zero. Try using TM_SQDIFF. But note that the best match is now at 0, so you have to threshold on low values. – fmw42 Nov 27 '22 at 23:54
  • @Grismar That is just a coincidence. It is almost always coloured in. However, my template for the minimap should exclude the inner area of the minimap regardless (the area that changes constantly). – ktom Nov 28 '22 at 00:44
  • @ChristophRackwitz The game client is resizable, and the UI elements can theoretically be moved around, so template matching is required for what I'm doing. There is an easier alternative to this problem: template matching (without alpha) a small static portion of the UI region and then just calculating a bounding box based on that point (since we know the size of the minimap, for instance), but there are many other ways I'd like to apply transparent template matching in this project, and the black skybox is a nuisance. Plus, I'm curious about what the solution is. – ktom Nov 28 '22 at 00:49
  • Post a [mre]. It works fine with the second game screen image and a template image with 1 big and 5 small circles. – relent95 Nov 28 '22 at 03:51
  • Your call of `cv2.matchTemplate()` does not treat the alpha layer directly, because it's given only RGB images. It's given a mask image, which you created from the alpha channel. Because its data type is 'uint8' (CV_8U), only pixels corresponding to nonzero pixels in the mask are calculated. See [the reference](https://docs.opencv.org/4.x/df/dfb/group__imgproc__object.html#ga586ebfb0a7fb604b35a23d85391329be). – relent95 Nov 28 '22 at 04:00
  • @fmw42, the template image of the OP is not a flat area of black, so the division by zero is not relevant. – relent95 Nov 28 '22 at 04:08
  • @relent95 But the image has flat regions of black the size of the template or larger, so there will still be division by zero when the template hits those regions. The denominator has the std of the template times the std of the current region. See docs.opencv.org/4.1.1/df/dfb/…. Also fmwconcepts.com/imagemagick/ACCELERATED_TEMPLATE_MATCHING.pdf – fmw42 Nov 28 '22 at 05:11
  • @fmw42, that case is handled by `matchTemplate()`. See [this](https://github.com/opencv/opencv/blob/4.6.0/modules/imgproc/src/templmatch.cpp#L1023) for example. – relent95 Nov 28 '22 at 07:18
  • @fmw42 Your solution works perfectly. Though, I'll admit I'm not too sure how I should be using the threshold value, as it seems to be finding the correct match regardless of the threshold value I use. I notice that the `min_val` produced by `cv2.minMaxLoc()` is anywhere between -1 and ~1.5 million, yet the resulting bounding box always seems to be correct... – ktom Nov 28 '22 at 16:58
  • TM_SQDIFF produces very large numbers (not in the range of 0 to 1 as in TM_SQDIFF_NORMED). But the latter (NORMED) version is not usable with a mask; see the documentation. So your threshold will need to be rather large. You need to experiment a bit, or compute the maximum squared difference over your template, i.e. 255*255*w*h of the template. – fmw42 Nov 28 '22 at 17:01
  • @fmw42 Understood. Is there any way I should be using these large numbers or are they irrelevant? – ktom Nov 28 '22 at 17:02
  • You need to experiment a bit, or compute the maximum squared difference over your template, i.e. 255*255*w*h of the template. You can then program to just take a percentage of that value, similar to using the range 0 to 1. That way, you have a known range for any size template. – fmw42 Nov 28 '22 at 18:59
  • @fmw42 I see. Strangely enough, I'm having huge success using TM_SQDIFF_NORMED with a mask, even though you mentioned a mask can't be used. – ktom Nov 30 '22 at 18:04
  • My version of the documentation at https://docs.opencv.org/4.1.1/df/dfb/group__imgproc__object.html#ga586ebfb0a7fb604b35a23d85391329be says for the mask argument: "Mask of searched template. It must have the same datatype and size with templ. It is not set by default. Currently, only the TM_SQDIFF and TM_CCORR_NORMED methods are supported." So it says it is limited to TM_SQDIFF and not the NORMED version. Perhaps it is different in your version, or the docs are wrong. – fmw42 Nov 30 '22 at 19:09
  • It looks like that restriction, or at least mention of it, has been removed in OpenCV 4.3 or higher. The version I was using above at the time was 4.1.1, which did say it was limited to those two methods. – fmw42 Nov 30 '22 at 19:23

0 Answers