
I have several vector paths and a query path, and I am trying to find the path that is most similar to the query path. I can access the length (perimeter) of each path, and the width and height of its bounding box. I am using Python with the PyX library to render SVG paths and calculate their bounding boxes. The pseudo code looks like this:

THRESHOLD = ...  # some value
qpath = ...  # my query path
similar_paths = []

for path in path_list:
    if (comparable width and comparable height and comparable perimeters):
        similar_paths.append(path)
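
A runnable sketch of this check might look like the following. The `PathInfo` record, the `comparable` helper, the sample numbers and the relative threshold of 0.1 are all assumptions made for illustration; none of them come from PyX or from the actual data.

```python
from dataclasses import dataclass


@dataclass
class PathInfo:
    """Hypothetical record holding the three measurements per path."""
    name: str
    width: float
    height: float
    perimeter: float


def comparable(a, b, threshold):
    """True if a and b differ by at most `threshold`, relative to the larger one."""
    return abs(a - b) <= threshold * max(abs(a), abs(b))


def similar_paths(query, candidates, threshold=0.1):
    """Keep the candidates whose width, height and perimeter all match the query."""
    return [
        p for p in candidates
        if comparable(p.width, query.width, threshold)
        and comparable(p.height, query.height, threshold)
        and comparable(p.perimeter, query.perimeter, threshold)
    ]


# Made-up measurements, only to exercise the filter.
query = PathInfo("query", 10.0, 4.0, 30.0)
candidates = [
    PathInfo("close", 10.5, 4.2, 31.0),
    PathInfo("too_wide", 20.0, 4.0, 30.0),
    PathInfo("too_long", 10.0, 4.0, 60.0),
]
result_names = [p.name for p in similar_paths(query, candidates)]
print(result_names)
```

A filter of this kind only looks at three numbers per path, so two very different shapes with the same bounding box and length pass it equally well.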

But it does not seem to give nice results. Any ideas on how to improve the results?

Selie
  • What does "nice" mean? In what way is the algorithm not nice and what would a nice algorithm produce? Give examples of paths that don't match but should and that do match but should not. – Robert Longson Jun 21 '15 at 08:10
  • @RobertLongson Well, nice means it should give similar paths. Say I have a zigzag path and a circle such that their bounding boxes are of comparable sizes. I am now getting such circles in my results, which implies that with high probability their lengths are also comparable. – Selie Jun 21 '15 at 08:36
  • Also, I am thinking that I can choose a few random points inside the query path and see if they also lie inside the path that I am comparing. But for this I have no way of knowing whether a point lies inside an SVG path or not. – Selie Jun 21 '15 at 08:38
  • document.elementFromPoint(x, y) will tell you that if you can access the DOM from python. – Robert Longson Jun 21 '15 at 09:07
  • @RobertLongson I cannot find any elementFromPoint function in Python. Can you provide a link for it? – Selie Jun 21 '15 at 09:46
  • https://developer.mozilla.org/en-US/docs/Web/API/Document/elementFromPoint but as I said I don't know whether you can access the DOM from python. – Robert Longson Jun 21 '15 at 09:54
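
The point-inside-path question raised in these comments can also be handled without a DOM once the path has been flattened to a closed polygon. The following is a minimal sketch of the even-odd ray casting test; the function name `point_in_polygon` and the example square are assumptions for illustration. Note that SVG fills use the nonzero rule by default, which agrees with even-odd only for simple, non-self-intersecting outlines.

```python
def point_in_polygon(x, y, polygon):
    """Even-odd ray casting test; `polygon` is a list of (x, y) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]  # wrap around to close the polygon
        # Does this edge straddle the horizontal line through y?
        if (y1 > y) != (y2 > y):
            # x coordinate where the edge crosses that horizontal line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            # Count only crossings to the right of the point.
            if x < x_cross:
                inside = not inside
    return inside


square = [(0, 0), (4, 0), (4, 4), (0, 4)]
inside_result = point_in_polygon(2, 2, square)
outside_result = point_in_polygon(5, 2, square)
print(inside_result, outside_result)
```

Curved segments would first have to be approximated by enough straight pieces for the polygon to follow the path closely.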

1 Answer


Let's use a simple PyX graph to generate some paths:

[Image: a PyX graph]

The paths could also come from an SVG file read in parsed mode.

Once you have PyX paths, you can use PyX features to get further information about the paths. In the following simple version I calculate a few points along each path and then sum up the distances between corresponding points. (I do it using method names ending in _pt, which work in PostScript points. This is a little faster than using PyX units. I also converted all paths to normpaths explicitly at the beginning. While this is not necessary, it helps to reduce the number of internal function calls.)

Here is the full code (including the graph to generate the sample paths):

import math
from pyx import *

# create some data (and draw it)
g = graph.graphxy(width=10, x=graph.axis.lin(min=0, max=2*math.pi))
qpi = g.plot(graph.data.function("y(x)=sin(x)"))
opi1 = g.plot(graph.data.function("y(x)=sin(x)+0.1*sin(10*x)"))
opi2 = g.plot(graph.data.function("y(x)=sin(x)+0.2*sin(20*x)", points=1000))
g.writePDFfile()

# get the corresponding PyX paths
qpath = qpi.path.normpath()
opath1 = opi1.path.normpath()
opath2 = opi2.path.normpath()

# now analyse it
POINTS = 10

qpath_arclen_pt = qpath.arclen_pt()
qpath_points_pt = qpath.at_pt([qpath_arclen_pt*i/(POINTS-1) for i in range(POINTS)])

for opath in [opath1, opath2]:
    opath_arclen_pt = opath.arclen_pt()
    opath_points_pt = opath.at_pt([opath_arclen_pt*i/(POINTS-1) for i in range(POINTS)])
    print(sum(math.sqrt((qpoint_x_pt-opoint_x_pt)**2 + (qpoint_y_pt-opoint_y_pt)**2)
              for (qpoint_x_pt, qpoint_y_pt), (opoint_x_pt, opoint_y_pt) in zip(qpath_points_pt, opath_points_pt)))

The program just prints out:

25.381154890630064
56.44386644062556

which indicates that the dashed line is closer to the solid one than the dotted line is.

You may also compare tangents, curvatures, the arclen itself etc. ... there are plenty of options depending on your needs.
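
As one example of such an alternative, here is a rough sketch that compares tangent directions instead of positions. It works in plain Python on lists of sampled points (such as those returned by sampling a path) rather than on PyX path objects, and the helper names `segment_angles`, `angle_difference` and `tangent_distance` are made up for this illustration. The tangent at each sample is approximated by the direction of the segment to the next sample.

```python
import math


def segment_angles(points):
    """Direction angle of each segment between consecutive sample points."""
    return [math.atan2(y2 - y1, x2 - x1)
            for (x1, y1), (x2, y2) in zip(points, points[1:])]


def angle_difference(a, b):
    """Smallest absolute difference between two angles, in the range [0, pi]."""
    d = (a - b) % (2 * math.pi)
    return min(d, 2 * math.pi - d)


def tangent_distance(points_a, points_b):
    """Sum of the direction differences of corresponding segments."""
    return sum(angle_difference(a, b)
               for a, b in zip(segment_angles(points_a), segment_angles(points_b)))


line = [(0, 0), (1, 0), (2, 0), (3, 0)]
shifted = [(x + 5, y - 2) for x, y in line]   # same shape, translated
zigzag = [(0, 0), (1, 1), (2, 0), (3, 1)]

d_shifted = tangent_distance(line, shifted)
d_zigzag = tangent_distance(line, zigzag)
print(d_shifted, d_zigzag)
```

Because only directions enter the sum, this measure ignores translation and uniform scaling, but unlike the point-distance sum it does change when one path is rotated relative to the other.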

wobsta
  • Nice answer but still a bit vague. I am thinking of using shape descriptors instead. I think they will give good results. – Selie Jun 23 '15 at 20:57
  • Also, it is not invariant under rotation, scaling, etc. – Selie Jun 24 '15 at 09:21
  • You did not specify a metric to measure the similarity of the paths. Please specify it and I might be able to help implement it using PyX paths. As an example, I just measured the distances between some points on the paths. This measure is rotation invariant (which can be tested by transforming the paths). It is not scaling invariant. Instead, when applying a scale to all paths, the result changes by the same factor. There are plenty of options to fix that, if needed, like dividing the result by the arclen of the query path. – wobsta Jun 24 '15 at 21:26
  • I did not understand the part about a metric for measuring path similarity. Can you please elaborate? – Selie Jun 25 '15 at 14:36
  • A metric is a mathematical term. It is a function that defines a distance between a pair of elements of a set. In our case the elements are the paths and the set itself is the set of all possible paths. In my example I calculated a number of points along the paths and added up the Euclidean distances between corresponding points on the two paths I'm comparing. You might do something else, more complicated maybe, depending on your needs. Further discussion is really pointless if you cannot specify your problem. As far as it has been specified, it is answered by my code already ... – wobsta Jun 26 '15 at 22:30
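
The scaling fix mentioned in the comments (dividing the summed point distances by the arc length of the query path) can be sketched in plain Python on polylines. This is only an illustration under stated assumptions: the helpers `arclen`, `sample` and `normalized_distance` are hypothetical names, paths are given as lists of (x, y) points rather than PyX paths, and the example coordinates are made up.

```python
import math


def arclen(points):
    """Total length of a polyline given as a list of (x, y) points."""
    return sum(math.hypot(x2 - x1, y2 - y1)
               for (x1, y1), (x2, y2) in zip(points, points[1:]))


def sample(points, n):
    """n points at equal arc-length spacing along a polyline (n >= 2)."""
    total = arclen(points)
    result = []
    seg = 0        # index of the current segment
    walked = 0.0   # arc length at the start of the current segment
    for i in range(n):
        target = total * i / (n - 1)
        while seg < len(points) - 2:
            (x1, y1), (x2, y2) = points[seg], points[seg + 1]
            seg_len = math.hypot(x2 - x1, y2 - y1)
            if walked + seg_len >= target:
                break
            walked += seg_len
            seg += 1
        (x1, y1), (x2, y2) = points[seg], points[seg + 1]
        seg_len = math.hypot(x2 - x1, y2 - y1)
        f = (target - walked) / seg_len if seg_len else 0.0
        result.append((x1 + f * (x2 - x1), y1 + f * (y2 - y1)))
    return result


def normalized_distance(query, other, n=10):
    """Summed distance of corresponding samples, divided by the query arc length."""
    total = sum(math.hypot(qx - ox, qy - oy)
                for (qx, qy), (ox, oy) in zip(sample(query, n), sample(other, n)))
    return total / arclen(query)


query = [(0, 0), (2, 0), (4, 0)]
other = [(0, 1), (2, 2), (4, 1)]
scale = lambda pts, s: [(s * x, s * y) for x, y in pts]

d_original = normalized_distance(query, other)
d_scaled = normalized_distance(scale(query, 3), scale(other, 3))
print(d_original, d_scaled)
```

Scaling both paths by the same factor multiplies the summed distances and the query arc length equally, so the two printed values should agree up to rounding.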