Approximate match using integer comparison

Question

I am looking for an algorithm that can match the saved pattern to the current pattern even if it is not exactly the same. For example, the saved pattern x is 0, 400, 900, 1500, 2000 and the current pattern y is 0, 300, 800, 1300, 1800.

Is there an algorithm that can match x and y even if they are not an exact match?

Or do I need to define a tolerance, so that each knock matches if the difference between x and y is less than or equal to that tolerance, and fails otherwise?

This is for a knock-detecting door lock. The values of x and y are the time intervals between the knocks. I want an algorithm that accepts the current pattern when it is close to the saved pattern even if it is not an exact match, because it is hard to repeat the same knock with exactly the same timing.

We'll need much more information before we can attempt to answer you. All I can say right now is that regex **alone** won't meet your needs. — Alan Moore, Oct 11 '15 at 00:41
I want an algorithm that can compare the value of x to y and the system will consider true if the value of Y is close to the value of X. — Genio21, Oct 11 '15 at 01:11
Forget all about using regex for this, Your idea of using an array and limiting values is way better. — Jongware, Oct 11 '15 at 01:32
By the way, I finished my project with your ideas. Thank you very much. :) — Genio21, Mar 23 '16 at 08:21

Giuseppe Ricupero · Answer 1 · 2015-10-11T04:47:17.540

If I understand correctly what you want, I advise you to simply calculate the total difference between the expected and the actual timing of each knock. Neither a regex nor a string distance algorithm (such as Levenshtein distance) will give you more accurate results.

The short Python script below uses both an absolute and a relative approach (it also accounts for missing actual knocks):

#!/usr/bin/env python3

expectedKnocks = [0, 400, 900, 1500, 2000]
actualKnocks = [0, 300, 800, 1300, 1800]

# absolute approach: sum of the per-knock timing differences
tolerance = 500
totalDifference = 0

# relative approach: mean of the per-knock relative errors
relativeTolerance = 0.15  # 15%
errorRate = 0.0

for i, item in enumerate(expectedKnocks):
    if i < len(actualKnocks):
        # abs() so early and late knocks cannot cancel each other out
        difference = abs(item - actualKnocks[i])
        totalDifference += difference
        if item > 0:
            errorRate += (difference / item) / len(expectedKnocks)
    else:
        # missing knock: count its full expected time and a 100% error
        totalDifference += item
        errorRate += 1.0 / len(expectedKnocks)

if totalDifference <= tolerance:
    print("[Absolute] OK, come in.", end=" ")  # end=" " prevents the newline
else:
    print("[Absolute] Go away!", end=" ")
print("Absolute time diff %s, tolerance %s" % (totalDifference, tolerance))

if errorRate <= relativeTolerance:
    print("[Relative] OK, come in.", end=" ")
else:
    print("[Relative] Go away!", end=" ")
print("Relative time diff %.2f%%, tolerance %.0f%%" % (errorRate * 100, relativeTolerance * 100))

The absolute approach is simply the total difference (presumably in ms) between all expected and actual knocks. In the relative approach, the script calculates the relative error for each pair of expected and actual knocks and weights it by the number of expected knocks.

By the way, I finished my project with your ideas. Thank you very much. :) — Genio21, Mar 23 '16 at 08:21
@Genio21: happy to hear from you, consider accepting the answer you consider correct or reporting yours for future reference! — Giuseppe Ricupero, Mar 24 '16 at 17:42

score 0 · Answer 2 · answered Oct 12 '15 at 12:04

This looks like a case for a nearest neighbor search, based on a distance function in a 5D continuous space. Have a look at https://en.wikipedia.org/wiki/Nearest_neighbor_search. There is no need for a sophisticated solution if you have only a few reference patterns (maybe just one?).

You can consider Euclidean or Manhattan distance.

From a reference test set, you should decide what is considered a match and what isn't in order to define a tolerance threshold.
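As a rough illustration of the idea above, here is a minimal sketch in Python. It is only one possible reading of the suggestion: the function names, the pattern name "front_door", and the threshold values are hypothetical, and real thresholds should come from your own test knocks.

```python
import math


def manhattan(a, b):
    """Sum of the absolute per-knock differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def euclidean(a, b):
    """Square root of the sum of squared per-knock differences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_pattern(references, attempt, distance, threshold):
    """Brute-force nearest neighbor over a small set of saved patterns.

    Returns (name, dist) for the closest pattern if dist <= threshold,
    (None, dist) if the closest pattern is too far away, and
    (None, None) if no saved pattern has the same number of knocks.
    """
    candidates = {name: pattern for name, pattern in references.items()
                  if len(pattern) == len(attempt)}
    if not candidates:
        return None, None
    best = min(candidates, key=lambda name: distance(candidates[name], attempt))
    dist = distance(candidates[best], attempt)
    return (best, dist) if dist <= threshold else (None, dist)


# Hypothetical saved pattern, using the values from the question.
references = {"front_door": [0, 400, 900, 1500, 2000]}
attempt = [0, 300, 800, 1300, 1800]

# Thresholds below are made-up placeholders, not recommended values.
print(nearest_pattern(references, attempt, manhattan, threshold=700))
print(nearest_pattern(references, attempt, euclidean, threshold=300))
```

With these placeholder thresholds the Manhattan check accepts the attempt (distance 600) while the Euclidean check rejects it (distance about 316), which shows why the threshold has to be tuned for the chosen distance function.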

By the way, I finished my project with your ideas. Thank you very much. :) – Genio21 Mar 23 '16 at 08:21
@Genio21: glad to know. – Mar 23 '16 at 08:46
