2

After running two different CV algorithms over an image (to find multiple occurrences of a detail), both deliver a list of results like the lists below.

The first algorithm delivers tuples of (left, top, width, height):

[(816, 409, 35, 39), (817, 409, 35, 39), (818, 409, 35, 39), (815, 410, 35, 39), (816, 410, 35, 39), (817, 410, 35, 39), (818, 410, 35, 39), (819, 410, 35, 39), (816, 411, 35, 39), (817, 411, 35, 39), (818, 411, 35, 39), (816, 447, 35, 39), (817, 447, 35, 39), (818, 447, 35, 39), (815, 448, 35, 39), (816, 448, 35, 39), (817, 448, 35, 39), (818, 448, 35, 39), (816, 449, 35, 39), (817, 449, 35, 39), (818, 449, 35, 39), (856, 639, 35, 39), (857, 639, 35, 39), (858, 639, 35, 39), (855, 640, 35, 39), (856, 640, 35, 39), (857, 640, 35, 39), (858, 640, 35, 39), (859, 640, 35, 39), (856, 641, 35, 39), (857, 641, 35, 39), (858, 641, 35, 39)]

Output of the second (CV2) algorithm is (coordinates of the upper-left corner):

[(816, 409), (817, 409), (818, 409), (815, 410), (816, 410), (817, 410), (818, 410), (819, 410), (816, 411), (817, 411), (818, 411), (816, 447), (817, 447), (818, 447), (815, 448), (816, 448), (817, 448), (818, 448), (819, 448), (816, 449), (817, 449), (818, 449), (856, 639), (857, 639), (858, 639), (855, 640), (856, 640), (857, 640), (858, 640), (859, 640), (856, 641), (857, 641), (858, 641)]

But on the screen there are only three occurrences of the searched item. Looking closely you can see that, for example, the first two entries are very similar (816 vs. 817 for the left position).

The CV2 code looks like this:

# detect image in image
import cv2
import numpy as np

img_rgb = open_cv_image  # original image
template = cv2.imread('C:/temp/detail.png')  # searching this!
h, w = template.shape[:2]  # shape is (rows, cols, channels)

res = cv2.matchTemplate(img_rgb, template, cv2.TM_CCOEFF_NORMED)
threshold = .8
loc = np.where(res >= threshold)
a = []
for pt in zip(*loc[::-1]):  # swap rows/cols to x/y
    cv2.rectangle(img_rgb, pt, (pt[0] + w, pt[1] + h), (0, 0, 255), 2)
    a.append(pt)
#cv2.imwrite('result.png', img_rgb)
print(a)

So both approaches do not deliver EXACT results, but a diffuse list of similar results. My questions are:

1. How to find out HOW MANY items are really found (grouping the results)?

2. How to reduce the list to one item in each group (it doesn't matter which one, as they are all similar)?

Is there an easy way to group similar tuples/lists in Python and reduce them to the essential items? Or is there any simple CV mechanism for Python that gives exact matches? Any help is appreciated...

Thanks in advance! Ulrich

Ulrich
  • could you please reduce your question to the most important things only. give an input/output example and state your code. – Aru May 28 '21 at 14:34
  • Changes done as requested! – Ulrich May 28 '21 at 14:44
  • This sounds like a clustering problem, more than a pure-coding problem. It might be more appropriate on the Data Science stack exchange: https://datascience.stackexchange.com/ As a first-pass attempt to help you get started on clustering, look into k-means clustering: https://www.analyticsvidhya.com/blog/2020/12/a-detailed-introduction-to-k-means-clustering-in-python/ – Sarah Messer May 28 '21 at 14:47
  • Has nobody else had the problem of many overlapping results when working with computer vision? Could overlapping rectangles be a solution? – Ulrich May 28 '21 at 17:45

2 Answers

3

You can keep every tuple whose Euclidean distance from the previous one is greater than 1 (or any other threshold) with a simple list comprehension; euclidean_distance is a helper you need to define yourself. However, you will need to insert a zero-valued tuple at the beginning. If s is your input list, then:

s.insert(0, (0,0,0,0))
t = [s[x] for x in range(1,len(s)) if euclidean_distance(s[x],s[x-1]) > 1]

>>> t
[(816, 409, 35, 39), (815, 410, 35, 39), (816, 411, 35, 39), (816, 447, 35, 39), (815, 448, 35, 39), (816, 449, 35, 39), (856, 639, 35, 39), (855, 640, 35, 39), (856, 641, 35, 39)]
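The answer leaves euclidean_distance undefined; a minimal sketch (the function name and the 4-tuple input come from the answer, everything else is an assumption) could look like this:

```python
import math

def euclidean_distance(a, b):
    # distance between two (left, top, width, height) tuples,
    # computed over all four components
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# small excerpt of the data from the question
s = [(816, 409, 35, 39), (817, 409, 35, 39), (818, 409, 35, 39),
     (815, 410, 35, 39)]
s.insert(0, (0, 0, 0, 0))
t = [s[x] for x in range(1, len(s)) if euclidean_distance(s[x], s[x - 1]) > 1]
print(t)
```

Note that entries at a cluster boundary (e.g. the jump from (818, 409, ...) to (815, 410, ...)) still exceed the threshold, so duplicates from the same cluster can survive, which matches the behaviour Ulrich describes in the comments.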
gimix
  • Counting the peaks in the Euclidean distance curve seems to be a reasonable approach! – Ulrich May 28 '21 at 17:44
  • I tried that, but in my case the distance is not constantly close to 1. In some cases it rises above 3 (for whatever reason), and then you have to evaluate minor and major peaks in that curve, which gets you back to the original data. – Ulrich May 30 '21 at 07:47
0

Non-maximum suppression (see: Non local maxima suppression in python and, especially for computer vision, https://www.pyimagesearch.com/2015/02/16/faster-non-maximum-suppression-python/) seems to be an established method for this problem. Downside: it eats a lot of resources and results in lower speed.
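As a rough illustration of the idea, a greedy, score-free variant of non-maximum suppression over (left, top, width, height) boxes can be sketched in plain Python (the IoU threshold of 0.5 is an assumption, not taken from the linked articles):

```python
def iou(a, b):
    # intersection-over-union of two (left, top, width, height) boxes
    ax1, ay1, ax2, ay2 = a[0], a[1], a[0] + a[2], a[1] + a[3]
    bx1, by1, bx2, by2 = b[0], b[1], b[0] + b[2], b[1] + b[3]
    iw = max(0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union else 0.0

def nms(boxes, max_iou=0.5):
    # greedily keep a box unless it overlaps an already kept box too much
    kept = []
    for box in boxes:
        if all(iou(box, k) <= max_iou for k in kept):
            kept.append(box)
    return kept
```

Real NMS keeps the highest-scoring box of each cluster (using the matchTemplate scores); this sketch simply keeps the first one seen, which is enough when any representative of a group will do.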

I picked up the idea of overlapping rectangles with the assumptions:

  1. Overlapping rectangles represent the same result
  2. Touching rectangles OR only slightly overlapping rectangles do NOT represent the same result (threshold needed)
  3. The data contains only true representations (no errors included / first item in data is one of the results)

With this you just need some geometry formulas to get a proper set of results. Here's the first draft of the code (it is a bit messy, but it works):


data = [(816, 409, 35, 39), (817, 409, 35, 39), (818, 409, 35, 39), 
        (815, 410, 35, 39), (816, 410, 35, 39), (817, 410, 35, 39), 
        (818, 410, 35, 39), (819, 410, 35, 39), (816, 411, 35, 39), 
        (817, 411, 35, 39), (818, 411, 35, 39), (816, 447, 35, 39), 
        (817, 447, 35, 39), (818, 447, 35, 39), (815, 448, 35, 39), 
        (816, 448, 35, 39), (817, 448, 35, 39), (818, 448, 35, 39), 
        (816, 449, 35, 39), (817, 449, 35, 39), (818, 449, 35, 39), 
        (856, 639, 35, 39), (857, 639, 35, 39), (858, 639, 35, 39), 
        (855, 640, 35, 39), (856, 640, 35, 39), (857, 640, 35, 39), 
        (858, 640, 35, 39), (859, 640, 35, 39), (856, 641, 35, 39), 
        (857, 641, 35, 39), (858, 641, 35, 39) ]

def Points(i):
    # converts (X, Y, W, H) to (X1, Y1, X2, Y2) 
    return (i[0],i[1],i[0]+i[2],i[1]+i[3])

def XYWH(i): 
    # converts (X1, Y1, X2, Y2) to (X, Y, W, H)
    return (i[0],i[1],i[2]-i[0],i[3]-i[1])

def Overlap(R1, R2):
    # check if two rectangles overlap
    R1 = Points(R1)
    R2 = Points(R2)
    if (R1[0]>=R2[2]) or (R1[2]<=R2[0]) or \
       (R1[3]<=R2[1]) or (R1[1]>=R2[3]): 
        return False
    return True

def Intersection(a,b):
    # return the intersection area
    a = Points(a)
    b = Points(b)
    x1 = max(min(a[0], a[2]), min(b[0], b[2]))
    y1 = max(min(a[1], a[3]), min(b[1], b[3]))
    x2 = min(max(a[0], a[2]), max(b[0], b[2]))
    y2 = min(max(a[1], a[3]), max(b[1], b[3]))
    if x1 < x2 and y1 < y2:
        return XYWH((x1, y1, x2, y2))

def Area(i):
    # calculate the size of a rectangle
    return i[2]*i[3]

def Covers(a, b):
    # calculates the share (0 to 1) of coverage
    if not Overlap(a, b): return None
    inters = Area(Intersection(a, b))
    original = Area(a)
    return inters/original


def Uniques(Data, Threshold = 0.8):
    # keep the first rectangle; keep each following one only if it
    # does not cover its predecessor by more than the threshold
    ret = [Data[0]]
    for i in range(len(Data)-1):
        c = Covers(Data[i], Data[i+1])
        if not c or c < Threshold: 
            ret.append(Data[i+1])
    return ret 

print(Uniques(data))

You need to experiment with the threshold a bit to get good results. Include a print(c) in the for loop inside the Uniques function to see its spread. The code above yields the three correct findings:

[(816, 409, 35, 39), (816, 447, 35, 39), (856, 639, 35, 39)]

The only downside is that this returns one possible rectangle per group rather than the optimal one, but for most use cases this should be sufficient.
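If a better representative per group matters, one option (a sketch, not part of the code above; the helper names are made up) is to collect all transitively overlapping rectangles into groups and take the component-wise average of each group:

```python
def _overlap(a, b):
    # axis-aligned overlap test for (left, top, width, height) boxes
    return not (a[0] >= b[0] + b[2] or a[0] + a[2] <= b[0] or
                a[1] >= b[1] + b[3] or a[1] + a[3] <= b[1])

def group_overlapping(rects):
    # merge rectangles into groups: two boxes share a group if they
    # overlap directly or through a chain of overlapping boxes
    groups = []
    for r in rects:
        hits = [g for g in groups if any(_overlap(r, o) for o in g)]
        merged = [r]
        for g in hits:
            merged.extend(g)
            groups.remove(g)
        groups.append(merged)
    return groups

def representative(group):
    # component-wise rounded average as the group's representative
    n = len(group)
    return tuple(round(sum(r[i] for r in group) / n) for i in range(4))
```

With the data from the question this gives three groups, and each representative sits at the center of its cluster instead of at whichever rectangle happened to come first.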

Ulrich