I am new to image recognition, so please be gentle :-). At the moment I'm trying to match a list of PNG files against screenshots. [Right-click any image and open it in a new tab to see a bigger version.]
Screenshots (image shape: 370x370x3):
Match against: (There are 4 different ones, each coming in 2 variants: a normal and a shiny one) (image shape: 256x256x4)
From the metadata, I could already narrow down the possible matches (FYI: the file names start with "pokemon_icon_585*").
What I've been trying to figure out is how to 'see' which one it is (i.e. which one has the highest likelihood). What I've tried so far:
- The 'center' of the image is at (x / 2, y * 2/3), which should make it easier to focus on the more relevant points. I've been trying to figure out how to do a color check weighted by the distance from that center. Not much luck there, likely because of my limited knowledge.
- Feature detection via OpenCV (ORB_create) + matching (BFMatcher). The found matches don't line up properly (for example, an ear is matched with a foot), but the top matches are all located on the animal, so that might be usable:

- I've been trying to figure out how to apply a mask (taken from the PNG and applied via a transformation onto the screengrab). This would let me remove the background so that it doesn't interfere with the detection.
- The next step would be to compare overall features/colors.
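To make the first bullet concrete, here is a minimal sketch of a color check weighted by the distance from that center. The Gaussian falloff, the `sigma_frac` parameter, and both function names are assumptions chosen for illustration; nothing here is tuned against the actual screenshots.

```python
import numpy as np

def center_weights(height, width, sigma_frac=0.25):
    """Gaussian weight map peaking at (x = width / 2, y = height * 2 / 3).

    sigma_frac is a hypothetical knob: the standard deviation expressed
    as a fraction of the smaller image dimension.
    """
    cx, cy = width / 2.0, height * 2.0 / 3.0
    ys, xs = np.mgrid[0:height, 0:width]
    sigma = sigma_frac * min(height, width)
    d2 = (xs - cx) ** 2 + (ys - cy) ** 2
    return np.exp(-d2 / (2.0 * sigma ** 2))

def weighted_mean_color(img, weights):
    """Per-channel average of an H x W x C image, weighted per pixel."""
    w = weights[..., None]  # broadcast the weights over the channels
    return (img.astype(np.float64) * w).sum(axis=(0, 1)) / weights.sum()

# Synthetic stand-in for a 370x370x3 screenshot
screenshot = np.full((370, 370, 3), (10, 20, 30), dtype=np.uint8)
weights = center_weights(370, 370)
mean_color = weighted_mean_color(screenshot, weights)
```

The resulting color vector could then be compared between a screenshot and each candidate icon, for example by Euclidean distance. Whether a single mean color is discriminative enough between a normal and a shiny variant is an open question.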
So in the above list I'm stuck trying to find information on step 3. What can I search for? What libraries/methods should I use? And what about step 4? How would you do this? As you can see from the screenshots and the 'templates' to match against, they look fairly similar to a human, but they are rotated a bit. As for the matches: screenshots 1 & 2 should match template 7, and screenshots 3 & 4 should match template 5.
Or of course if my approach is wrong, please do tell me how you would approach it with some keywords! That'd be nice!
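For the mask part of step 3, one possible starting point is the alpha channel of the 256x256x4 templates. The sketch below assumes the PNG is loaded with `cv2.imread(path, cv2.IMREAD_UNCHANGED)` so the alpha channel is kept; here a synthetic array stands in for that image. The function names and the `threshold` parameter are hypothetical, and the transformation of the mask onto the screenshot is not covered.

```python
import numpy as np

def alpha_to_mask(rgba, threshold=0):
    """Binary uint8 mask: 255 where alpha > threshold, 0 elsewhere."""
    if rgba.ndim != 3 or rgba.shape[2] != 4:
        raise ValueError("expected an H x W x 4 image with an alpha channel")
    return np.where(rgba[..., 3] > threshold, 255, 0).astype(np.uint8)

def masked_mean_color(img, mask):
    """Mean color of img over the pixels where mask is non-zero."""
    return img[mask > 0].astype(np.float64).mean(axis=0)

# Synthetic stand-in for a template: opaque square on a transparent background
template = np.zeros((256, 256, 4), dtype=np.uint8)
template[64:192, 64:192] = (30, 60, 90, 255)

mask = alpha_to_mask(template)
sprite_color = masked_mean_color(template[..., :3], mask)
```

Once a transformation between template and screenshot has been estimated, a mask like this could be warped with `cv2.warpAffine` or `cv2.warpPerspective` and used to ignore background pixels in the screenshot.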
The current way that I'm going is to create a bitmask to remove the background:
import cv2
import numpy as np

# Read the image
img1 = cv2.imread('kkgbgwwpwu04ji.png')
# Find the edges
edges = cv2.Canny(img1, 100, 200)
# Detect the horizontal line at the bottom so it can be removed
horizontal_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (25, 1))
detected_lines = cv2.morphologyEx(edges.copy(), cv2.MORPH_OPEN, horizontal_kernel, iterations=2)
cnts, _ = cv2.findContours(detected_lines, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
img2 = edges.copy()
for c in cnts:
    # Paint the detected line segments black
    cv2.drawContours(img2, [c], -1, (0, 0, 0), 2)
# x1 and x2 are row indices that are not defined in this snippet:
# rows x1..x2 are taken from img2 (lines removed), the rest from edges
final = np.vstack([edges[:x1], img2[x1:x2], edges[x2:]])
# Get a bitmask by closing the gaps between the edges
kernel = np.ones((5, 5), np.uint8)
mask = cv2.morphologyEx(edges.copy(), cv2.MORPH_CLOSE, kernel, iterations=10)
But how do I remove the arc at the top? I know I could just crop the picture, but some Pokémon are very large and would be cut off as well.
This is final:
And this is mask: