
I'm trying to use https://deno.land/x/opencv@v4.3.0-10 to get template matching working in Deno. I based my code heavily on the provided Node example, but can't quite work it out yet.

Following the source code, I first stumbled upon `error: Uncaught (in promise) TypeError: Cannot convert "undefined" to int` while calling cv.matFromImageData(imageSource).
After experimenting and searching, I figured out that the function expects `{data: Uint8ClampedArray, height: number, width: number}`. This is based on this SO post and might be incorrect, hence posting it here.

The issue I'm currently facing is that I don't seem to get proper matches for my template. Only when I set the threshold to 0.1 or lower do I get a match, but it is not correct: `{ xStart: 0, yStart: 0, xEnd: 29, yEnd: 25 }`.

I used the images provided by the templateMatching example here.
Haystack: [image]
Needle: [image]

Any input/thoughts on this are appreciated.

import { cv } from 'https://deno.land/x/opencv@v4.3.0-10/mod.ts';

export const match = (imagePath: string, templatePath: string) => {
    const imageSource = Deno.readFileSync(imagePath);
    const imageTemplate = Deno.readFileSync(templatePath);

    const src = cv.matFromImageData({ data: imageSource, width: 640, height: 640 });
    const templ = cv.matFromImageData({ data: imageTemplate, width: 29, height: 25 });

    const processedImage = new cv.Mat();
    const logResult = new cv.Mat();
    const mask = new cv.Mat();

    cv.matchTemplate(src, templ, processedImage, cv.TM_SQDIFF, mask);

    cv.log(processedImage, logResult)
    console.log(logResult.empty())
};

UPDATE

Using @ChristophRackwitz's answer and digging into opencv(js) docs, I managed to get close to my goal.

I decided to step back from handling multiple matches and focus on a single (best) match of my needle in the haystack, since that is ultimately my use case anyway.

Going through the code provided in this example and comparing its data with the data in my code, I figured something was off with the binary image data I supplied to cv.matFromImageData. I solved this by properly decoding the PNG and passing the decoded image's bitmap to cv.matFromImageData.

I used TM_SQDIFF as suggested, and got some great results.
Haystack: [image]
Needle: [image]
Result: [image]

I achieved this in the following way.

import { cv } from 'https://deno.land/x/opencv@v4.3.0-10/mod.ts';
import { Image } from 'https://deno.land/x/imagescript@v1.2.14/mod.ts';

export type Match = false | {
    x: number;
    y: number;
    width: number;
    height: number;
    center?: {
        x: number;
        y: number;
    };
};

export const match = async (haystackPath: string, needlePath: string, drawOutput = false): Promise<Match> => {
    const perfStart = performance.now();

    const haystack = await Image.decode(Deno.readFileSync(haystackPath));
    const needle = await Image.decode(Deno.readFileSync(needlePath));

    const haystackMat = cv.matFromImageData({
        data: haystack.bitmap,
        width: haystack.width,
        height: haystack.height,
    });
    const needleMat = cv.matFromImageData({
        data: needle.bitmap,
        width: needle.width,
        height: needle.height,
    });

    const dest = new cv.Mat();
    const mask = new cv.Mat();
    cv.matchTemplate(haystackMat, needleMat, dest, cv.TM_SQDIFF, mask);

    const result = cv.minMaxLoc(dest, mask);
    const match: Exclude<Match, false> = { // narrowed so `center` can be assigned below
        x: result.minLoc.x,
        y: result.minLoc.y,
        width: needleMat.cols,
        height: needleMat.rows,
    };
    match.center = {
        x: match.x + (match.width * 0.5),
        y: match.y + (match.height * 0.5),
    };

    if (drawOutput) {
        haystack.drawBox(
            match.x,
            match.y,
            match.width,
            match.height,
            Image.rgbaToColor(255, 0, 0, 255),
        );
    
        Deno.writeFileSync(`${haystackPath.replace('.png', '-result.png')}`, await haystack.encode(0));
    }

    const perfEnd = performance.now();
    console.log(`Match took ${perfEnd - perfStart}ms`);

    return match.x > 0 || match.y > 0 ? match : false;
};

ISSUE

The remaining issue is that I also get a false match when there should be no match at all.
Based on what I know so far, I should be able to solve this with a threshold, like so:

cv.threshold(dest, dest, 0.9, 1, cv.THRESH_BINARY);

Adding this line after matchTemplate does indeed stop the false matches when I don't expect any, but I also no longer get a match when I DO expect one.

Obviously I am missing something about how cv.threshold works. Any advice on that?

UPDATE 2

After experimenting and reading some more I managed to get it to work with normalised values like so:

cv.matchTemplate(haystackMat, needleMat, dest, cv.TM_SQDIFF_NORMED, mask);
cv.threshold(dest, dest, 0.01, 1, cv.THRESH_BINARY);

Other than being normalised, it seems to do the trick consistently for me. However, I would still like to know why I can't get it to work without normalised values, so any input is still appreciated. I will mark this post as solved in a few days to give people a chance to discuss the topic some more while it's still relevant.
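
As hinted at in the comments and by `maxscore` in the answer below, a raw TM_SQDIFF score is a sum of squared differences, so its scale depends on the needle's area and channel count; even a very good match typically scores far above 0.9, so THRESH_BINARY maps the whole result to 1 and minMaxLoc no longer points at the match. A rough Python sketch (the file names and the 1% factor are illustrative, not from the original code) of deriving a scale-aware raw threshold:

import cv2 as cv

haystack = cv.imread("haystack.png")  # illustrative paths
needle = cv.imread("needle.png")

# raw TM_SQDIFF scores are sums of squared differences, so their scale grows
# with the needle area and the channel count; derive the threshold from that
# scale instead of hard-coding a value like 0.9
scores = cv.matchTemplate(haystack, needle, cv.TM_SQDIFF)
(nh, nw) = needle.shape[:2]
maxscore = 255**2 * (nh * nw * 3)        # maximum conceivable score (8-bit, 3 channels)
(minval, _, minloc, _) = cv.minMaxLoc(scores)
is_match = minval <= maxscore * 0.01     # accept only near-identical patches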

  • play with the TM_* methods. use `TM_SQDIFF` to start. that'll give you something that isn't "normalized" to hell. it's a difference, i.e. measure of DISsimilarity, not a matching score. – Christoph Rackwitz Jul 30 '22 at 16:42
  • @ChristophRackwitz I did play around with these, and some indeed give way more matches than others. However the coordinates still don't seem to be correct. – S. Van den Wyngaert Jul 30 '22 at 18:21
  • stop before thresholding. all that is probably wrong. look at the result of matchTemplate as grayscale. use `cv::log` and normalize with MINMAX to view the whole range of values. make sure your question's code shows the TM_* that you actually use. -- yeah I'm editing my comments a lot, and I write before I'm done thinking. I'll try to reproduce from python. – Christoph Rackwitz Jul 30 '22 at 18:26
  • with TM_CCOEFF_NORMED, that needle and haystack, I'm getting this: https://i.stack.imgur.com/cw9sj.png (addWeighted on top of haystack) which obviously sucks because that method is silly. – Christoph Rackwitz Jul 30 '22 at 18:37
  • I updated the code using TM_SQDIFF and took a step back to matchTemplate, like you suggested. I'm trying to find my way through the docs, which is not trivial given I'm inexperienced with opencv and am using it in node/deno. The logResult returns false on empty, so it holds some data. How would I proceed from here? – S. Van den Wyngaert Jul 30 '22 at 18:45
  • yeah the docs won't help here, which is a shame. you need to build an intuition for what these TM_* do, from experience. the equations are awful, worse than code. all the CCORR ones will fail horribly if the "signal" has any "DC offset" (not OpenCV's fault, that's just what happens with simple multiplication). any real world applications require SQDIFF. if you aren't bothered by python, I think I'll just write an answer and demonstrate. – Christoph Rackwitz Jul 30 '22 at 18:49
  • Python works for me as a demonstration. The reason I picked Deno for my project is to experiment with new JS runtimes; since I use Node.js in a professional context, it's something fresh for me, but still relevant. – S. Van den Wyngaert Jul 30 '22 at 18:52
  • oh, as for that `undefined`... I'd guess `imageSource` is undefined, which might indicate that jimp couldn't read those files. – Christoph Rackwitz Jul 30 '22 at 19:57
  • and I regret that I'm no help with the "opencv.js" variant of OpenCV. the interface is somewhat obscure, but probably on par with C++ (in terms of madness). python and numpy are a delight in comparison. – Christoph Rackwitz Jul 30 '22 at 20:05
  • Your answer is a great source for helping me understand what's going on. I'm now figuring out the specifics of how to get it to work with opencv.js. I will post my progress as soon as it is relevant. Thanks for the effort so far! – S. Van den Wyngaert Jul 30 '22 at 20:07
  • when you use the `*_NORMED` modes, you don't have to think about the scale of values. you can get it to work using non-normalized values but then you'll have to know/calculate what range of values is achievable, or how the values behave for various kinds of image difference, and pick a threshold based on that. an indication for that is `maxscore` in my example code below. I only use it for the visualization because I set my thresholds using knowledge about the calculation (sum of squared differences) and some judgment about what's "a good match" (here: only tiny differences) – Christoph Rackwitz Aug 10 '22 at 00:37
  • I respect your battle with opencv.js. Its "Mat" interface leaves much to be desired. There's `numjs`, but that seems to have been very new back in 2017 when opencv.js was created. If you continue to need opencv.js, perhaps you'll push for or support improvements. The next GSoC is sure to happen, and maybe they need mentors. – Christoph Rackwitz Aug 10 '22 at 00:43

1 Answer


The TM_* methods of matchTemplate are treacherous. And the docs throw formulas at you that would make anyone feel dumb, because they're code, not explanation.

Consider the calculation of one correlation: one particular position of the template/"needle" on the "haystack".

All the CCORR modes simply multiply elementwise. Your data uses white as the "background", which acts as a "DC offset". The signal, i.e. the difference from white of anything not-white, will drown in that large DC offset. The calculated correlation scores will vary mostly with the DC offset and hardly at all with the actual signal/difference.
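
A toy illustration of that effect, with made-up numbers (plain cross-correlation of a 5x5 region): the empty white background outscores a region with actual content, because the products are dominated by the offset rather than the signal.

import numpy as np

# plain cross-correlation score: elementwise product, summed
needle = np.full((5, 5), 255.0)  # mostly-white needle
white  = np.full((5, 5), 255.0)  # empty white background region
shape  = np.full((5, 5), 200.0)  # region containing actual content

print((needle * white).sum())    # 1625625.0 -> empty background scores highest
print((needle * shape).sum())    # 1275000.0 -> real content scores lower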

This is what that looks like, roughly. The result of running with TM_CCOEFF_NORMED, overlaid on top of the haystack (with some padding). You're getting big fat responses for all instances of all shapes, no matter their specific shape.

[image: TM_CCORR* response overlaid on the haystack]

You want to use differences instead. The SQDIFF modes will handle that. Squared differences are a measure of dissimilarity, i.e. a perfect match will give you 0.

Let's look at some values...

(hh, hw) = haystack.shape[:2]
(nh, nw) = needle.shape[:2]
scores = cv.matchTemplate(image=haystack, templ=needle, method=cv.TM_SQDIFF)
(sh, sw) = scores.shape # will be shaped like haystack - needle

scores = np.log10(1+scores) # any log will do

maxscore = 255**2 * (nh * nw * 3)
# maximum conceivable SQDIFF score, 3-channel data, any needle
# for a specific needle:
#maxscore = (np.maximum(needle, 255-needle, dtype=np.float32)**2).sum()

# map range linearly, from [0 .. ~8] to [1 .. 0] (white to black)
(smin, smax) = (0.0, np.log10(1+maxscore))
(omin, omax) = (1.0, 0.0)
print("mapping from", (smin, smax), "to", (omin, omax))
out = (scores - smin) / (smax - smin) * (omax - omin) + omin

[image: logarithmic view of SQDIFF scores]

You'll see gray peaks, but some are actually (close to) white while others aren't. Those are truly instances of the needle image. The other instances differ more from the needle, so they're just some reddish shapes that kinda look like the needle.

Now you can find local extrema. There are many ways to do that. You'll want to do two things: filter by absolute value (threshold) and suppress non-maxima (scores above threshold that are dominated by a better nearby score). I'll just do the filtering and pretend there aren't any nearby non-maxima, because the resulting peaks fall off strongly enough. If that happens not to be the case, you'd see double drawing in the picture below: boxes becoming "bold" because they're drawn twice at adjacent pixel positions.

I'm picking a threshold of 2.0 because that represents a difference of 100, i.e. one color value in one pixel may have differed by 10 (10*10 = 100), or two values may have differed by 7 each (7*7 = 49, twice makes 98), ... so it's still a very tiny, imperceptible difference. A threshold of 6 would mean a sum of squared differences of up to a million, allowing for a lot more difference.

(i,j) = (scores <= 2.0).nonzero() # threshold "empirically decided"
instances = np.transpose([j,i]) # list of (x,y) points

That's giving me 16 instances.

canvas = haystack.copy()
for pt in instances:
    (j,i) = pt
    score = scores[i,j]

    cv.rectangle(canvas,
        pt1=(pt-(1,1)).astype(int), pt2=(pt+(nw,nh)).astype(int),
        color=(255,0,0), thickness=1)

    cv.putText(canvas,
        text=f"{score:.2f}",
        org=(pt+[0,-5]).astype(int),
        fontFace=cv.FONT_HERSHEY_SIMPLEX, fontScale=0.4,
        color=(255,0,0), thickness=1)

That's drawing a box around each, with the logarithm of the score above it.

[image: results with bounding boxes]


One simple approach to get candidates for Non-Maxima Suppression (NMS) is to cv.dilate the scores and equality-compare, to gain a mask of candidates. Scores that are local maxima will compare equal to themselves in the dilated array, while every surrounding score will be less. This alone has some corner cases you will need to handle. Note: at this stage, those are local maxima of any value; you need to combine (logical and) that mask with a mask from thresholding the values.
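
A minimal sketch of that idea, assuming the `scores` array and the threshold from the code above; note that with SQDIFF the matches are local minima, so cv.erode (a minimum filter, the dual of dilate) takes the place of dilate here:

kernel = np.ones((nh, nw), np.uint8)  # neighborhood on the order of one needle
local_min = cv.erode(scores, kernel)  # each pixel becomes the minimum of its neighborhood
candidates = (scores == local_min)    # a score equal to its local minimum is a local minimum
candidates &= (scores <= 2.0)         # combine with the absolute threshold from above
(i, j) = candidates.nonzero()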

NMS is commonly required to handle immediate neighbors that are above the threshold, and to merge them or pick the better one. You can do that by simply running connectedComponents(WithStats) and taking the blob centers. I think that's clearly better than trying to find contours.
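
Continuing the sketch from above with the `candidates` mask (again just an outline):

mask = candidates.astype(np.uint8)  # 8-bit mask of thresholded local minima
(nlabels, labels, stats, centroids) = cv.connectedComponentsWithStats(mask)
for k in range(1, nlabels):         # label 0 is the background
    (cx, cy) = centroids[k]         # one merged detection per blob
    print(f"match near ({cx:.0f}, {cy:.0f})")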

The dilate-and-compare approach will not suppress neighbors if they have exactly the same score. If you did the connectedComponents step, only non-immediate neighbors are left to deal with. What to do with them is up to you; it's not clear-cut anyway.

Christoph Rackwitz
  • Thank you for your in-depth answer; it's been a great help in getting a better understanding of what's going on. I updated my question with my current progress. – S. Van den Wyngaert Jul 31 '22 at 15:09
  • Sorry it took me a while to get back to this question. I'm closing this and accepting your answer since it provided me with the information and understanding of what was going on. I managed to get the required result, only matching a single instance. Thanks for the great help! – S. Van den Wyngaert Aug 09 '22 at 20:28