I am trying to compare a polyline (the overview_polyline returned by the Google Directions API) with a set of existing polylines and find out which part of the new polyline is already contained in one of them. For me, a polyline is a representation of a driving route retrieved from the Google Directions API. It can be any route anywhere in the world, though for simplification we can always pick routes that belong to a particular city or country and compare only those. Also, at the moment a route may be at most 250 km long. Here are some examples:

enter image description here

enter image description here

It doesn't matter which route is the existing one here and which is new. In either case I would like to get the result that these routes are similar (OK, maybe they are not 90% similar, but let's assume they are).

At the moment I am using brute force to compare the new polyline with the existing polylines one by one. Before that, I split the polylines into points using this algorithm and compare each point to see if there is a match. I treat two points as the same if the distance between them is less than 100 meters.

If I find an existing polyline that mostly covers the new polyline, I stop processing.

It looks like this:

Polyline findExistingPolyline(Polyline[] polylines, Polyline polyline) {
  LatLng[] polylinePoints = PolylineDecoder.toLatLng(polyline);
  for (Polyline existing: polylines) {
    LatLng[] existingPoints = PolylineDecoder.toLatLng(existing);
    if (isMostlyCovered(existingPoints , polylinePoints)) {
       return existing;
    }
  }

  return null;

}

boolean isMostlyCovered(LatLng[] existingPoints, LatLng[] polylinePoints) {
  int initialSize = polylinePoints.length;
  for (LatLng point: polylinePoints) {
    for (LatLng existingPoint: existingPoints) {
       if (distanceBetween(existingPoint, point) <= 100) {
         polylinePoints.remove(); // I actually use an iterator; this is just a demonstration
       }
    }
  }
  // check how many points are left and decide if polyline is mostly covered
  // if 90% of the points are removed, the existing polyline covers the new polyline
  return (polylinePoints.length * 100 / initialSize) <= 10;
}

Obviously, this algorithm sucks (especially in its worst case, when there is no match for the new polyline), as there are too many loops and possibly too many points to compare.

So I was wondering whether there is a more efficient approach to comparing polylines with each other.

Alex K.
  • That's an interesting problem, but could you please provide more information, or better yet, a real example? For example, what does a typical polyline look like? Is it a path within a city or across the world? Can the polylines you compare to be precalculated or do they change often and have to be updated on the fly? Can you store additional values with each polyline, for example lat/lon range covered and line length, so that you can rule out obvious mismatches early? – M Oehm Mar 01 '14 at 08:48
  • Thank you for your comment. Routes are static and can't be updated. I can store additional information with it and I have its length. So, yes I can filter out routes which have distance between start points longer, than the sum of their lengths. But this would not dramatically improve this algorithm. – Alex K. Mar 01 '14 at 16:55
  • getting error on PolylineDecoder, I have to write this class by my own or there is some import for it? – shehzy Oct 16 '14 at 12:02
  • I think I used this one (slightly modified): http://www.geekyblogger.com/2010/12/decoding-polylines-from-google-maps.html – Alex K. Oct 16 '14 at 16:17
  • @AlexanderGavrilov this class produces an exception(StringIndexOutOfBoundsException) on line "b = encoded.charAt(index++) - 63;" for second do while loop, how did you handle this? thanks – shehzy Oct 17 '14 at 07:56
  • Im also trying to accomplish the same task. Were u successful in implementing ur algorithm? – zoram Mar 04 '15 at 06:31
  • @zoram, unfortunately I couldn't come up with appropriate algorithm in reasonable amount of time, so I skipped its implementation for the time being. – Alex K. Mar 04 '15 at 12:06

1 Answer

You seem to compare only the points of the polylines, not the lines in between. That means that a straight line and the same line with an additional center point won't match. Or am I missing something? (If my assumption is right, that's the weak point in your method, I think.)
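One way to take the lines between the points into account is to measure the distance from a point to the nearest segment of the other polyline instead of to its nearest vertex. The following is only a sketch, assuming coordinates can be treated as locally flat (reasonable for short distances away from the poles). The class name, the method name and the representation of a point as a {latitude, longitude} array in degrees are hypothetical and not part of the question's code.

```java
final class SegmentDistance {
    // Assumed mean Earth radius in meters (spherical model).
    static final double EARTH_RADIUS_M = 6371000.0;

    /** Distance in meters from point p to segment a-b; each point is {latDeg, lngDeg}. */
    static double distanceToSegmentMeters(double[] p, double[] a, double[] b) {
        // Project a and b into a local flat frame whose origin is p.
        double cosLat = Math.cos(Math.toRadians(p[0]));
        double ax = Math.toRadians(a[1] - p[1]) * EARTH_RADIUS_M * cosLat;
        double ay = Math.toRadians(a[0] - p[0]) * EARTH_RADIUS_M;
        double bx = Math.toRadians(b[1] - p[1]) * EARTH_RADIUS_M * cosLat;
        double by = Math.toRadians(b[0] - p[0]) * EARTH_RADIUS_M;

        double dx = bx - ax;
        double dy = by - ay;
        double lenSq = dx * dx + dy * dy;

        // Parameter of the closest point on the segment, clamped to [0, 1].
        // A zero-length segment degenerates to the single point a.
        double t = lenSq == 0.0
                ? 0.0
                : Math.max(0.0, Math.min(1.0, -(ax * dx + ay * dy) / lenSq));

        // Closest point on the segment; p is the origin, so its norm is the distance.
        double cx = ax + t * dx;
        double cy = ay + t * dy;
        return Math.hypot(cx, cy);
    }
}
```

With this, a point in the middle of a long straight stretch matches that stretch even though both of its end vertices are more than 100 m away.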

The distance calculation you use involves ellipsoid trigonometry and probably is expensive. You don't need exact measures here, though, you just want to match two nodes. If you need to cover a well-known range that's not close to a pole, you could consider lat/lon as flat coordinates, maybe with a correction to the longitude.

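// Mean Earth radius in meters.
static final double R = 6371000.0;

boolean isWithin100m(LatLng a, LatLng b) {
    // North-south offset in meters.
    double dy = (a.lat - b.lat) * R * Math.PI / 180.0;

    if (dy < -100 || dy > 100) return false;

    // East-west offset in meters: a degree of longitude shrinks
    // with the cosine of the latitude, so multiply by it.
    double dmid = 0.5 * (a.lat + b.lat) * Math.PI / 180.0;
    double dx = (a.lng - b.lng) * R * Math.PI / 180.0 * Math.cos(dmid);
    return dx*dx + dy*dy <= 10000.0;
}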

Here, R is Earth's radius in meters. That method should be faster than your exact solution. If the cosines of the latitudes of your northernmost and southernmost points are similar, you could even skip computing them and just multiply the longitude difference by a fixed average cosine as a constant factor.

Also, you decode your new polyline with every comparison. You could do that only once in findExistingPolyline and pass a LatLng[] to isMostlyCovered. If you can precalculate data for your existing polylines, storing them as LatLng[] would also help. Keeping the extreme latitudes and longitudes for each polyline and maybe a line length can help you to rule out obvious mismatches early on.
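The early rejection by extreme latitudes and longitudes could look roughly like the sketch below. It assumes points are {latitude, longitude} arrays in degrees and that no route crosses the antimeridian. The class name and the margin parameter are hypothetical.

```java
final class BoundingBox {
    final double minLat, maxLat, minLng, maxLng;

    /** Computes the extreme latitudes and longitudes of a decoded polyline. */
    BoundingBox(double[][] points) {
        double loLat = Double.POSITIVE_INFINITY, hiLat = Double.NEGATIVE_INFINITY;
        double loLng = Double.POSITIVE_INFINITY, hiLng = Double.NEGATIVE_INFINITY;
        for (double[] p : points) {
            loLat = Math.min(loLat, p[0]);
            hiLat = Math.max(hiLat, p[0]);
            loLng = Math.min(loLng, p[1]);
            hiLng = Math.max(hiLng, p[1]);
        }
        minLat = loLat;
        maxLat = hiLat;
        minLng = loLng;
        maxLng = hiLng;
    }

    /** True if this box, grown by marginDeg on every side, intersects the other box. */
    boolean overlaps(BoundingBox other, double marginDeg) {
        return minLat - marginDeg <= other.maxLat
                && other.minLat <= maxLat + marginDeg
                && minLng - marginDeg <= other.maxLng
                && other.minLng <= maxLng + marginDeg;
    }
}
```

If the boxes do not overlap, no point of one route can be within the margin of the other, so the point-by-point comparison can be skipped. A margin of 0.001 degrees is roughly 111 m of latitude; the same number of degrees of longitude covers less ground at higher latitudes, so the margin should be chosen generously there.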

Maybe you should even go beyond that: Along with longitude and latitude, store Earth-Centered, Earth-fixed coordinates and keep them in a k-d Tree for easy closest-neighbour lookup. That's my bet for the best speedup of your algorithm, at the cost of extra data.
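As a rough illustration of the coordinate conversion only (the k-d tree itself is not shown), the sketch below turns latitude and longitude into Earth-centered Cartesian coordinates. It assumes a spherical Earth rather than a proper ellipsoid, which should be accurate enough for a 100 m threshold. The class and method names are hypothetical.

```java
final class Ecef {
    // Assumed mean Earth radius in meters (spherical model, not an ellipsoid).
    static final double EARTH_RADIUS_M = 6371000.0;

    /** Converts latitude/longitude in degrees to {x, y, z} in meters. */
    static double[] toEcef(double latDeg, double lngDeg) {
        double lat = Math.toRadians(latDeg);
        double lng = Math.toRadians(lngDeg);
        double cosLat = Math.cos(lat);
        return new double[] {
            EARTH_RADIUS_M * cosLat * Math.cos(lng),
            EARTH_RADIUS_M * cosLat * Math.sin(lng),
            EARTH_RADIUS_M * Math.sin(lat)
        };
    }

    /** Straight-line (chord) distance in meters between two {x, y, z} points. */
    static double chordDistance(double[] p, double[] q) {
        double dx = p[0] - q[0];
        double dy = p[1] - q[1];
        double dz = p[2] - q[2];
        return Math.sqrt(dx * dx + dy * dy + dz * dz);
    }
}
```

In these coordinates the distance between two nearby points is a plain Euclidean distance with no trigonometry per comparison, which is the kind of metric a k-d tree expects. For points about 100 m apart, the chord and the distance along the surface are practically identical.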

And it's probably better not to create a new list for each polyline and then delete from it but to keep the lists intact and keep a local "used" set which should be quicker to look up than deleting points.
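A minimal sketch of that idea, simplified further: since each point of the new polyline is visited exactly once, a counter of covered points is enough and nothing has to be deleted. It assumes points are {latitude, longitude} arrays in degrees and uses the flat approximation for the 100 m test. The class name and the array representation are hypothetical.

```java
final class Coverage {
    // Assumed mean Earth radius in meters.
    static final double EARTH_RADIUS_M = 6371000.0;

    /** Flat-coordinate test whether two {latDeg, lngDeg} points are within 100 m. */
    static boolean isWithin100m(double[] a, double[] b) {
        double dy = Math.toRadians(a[0] - b[0]) * EARTH_RADIUS_M;
        if (dy < -100 || dy > 100) return false;
        double midLat = Math.toRadians(0.5 * (a[0] + b[0]));
        double dx = Math.toRadians(a[1] - b[1]) * EARTH_RADIUS_M * Math.cos(midLat);
        return dx * dx + dy * dy <= 10000.0;
    }

    /** True if at least 90% of the candidate's points lie within 100 m of an existing point. */
    static boolean isMostlyCovered(double[][] existing, double[][] candidate) {
        if (candidate.length == 0) return false;
        int covered = 0;
        for (double[] point : candidate) {
            for (double[] existingPoint : existing) {
                if (isWithin100m(existingPoint, point)) {
                    covered++;
                    break; // one match is enough for this point
                }
            }
        }
        // Integer form of covered / candidate.length >= 0.9.
        return covered * 10 >= candidate.length * 9;
    }
}
```

Both arrays stay untouched, so the decoded existing polylines can be cached and reused across calls.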

M Oehm
  • Thank you for your response. I don't need great precision to find distance between two points. For me it would be OK even, if I could just compare points using equals. Unfortunately, Directions API may not return always exactly same coordinates, that is why I just use distanceBetween. Also, you are right regarding decoding polyline each time, I already had this, but didn't write this example code correctly, because I had no code at my fingertips. And I would like not to store additional data with route, if possible. Only if it would really speed up the whole process. (I use appengine). – Alex K. Mar 01 '14 at 17:01
  • Thanks for the additional information. The points in the polyline are the blue pins, right? Then I don't see how your algorithm can compare similarities between the shown paths. As far as I can see, the only equivalent points are at Elmhurst and S La Grange. If the paths ended at Elmhurst, that would be 2 matched points out of 4 in total, i.e. a match of 50%, whereas the paths are nearly identical after Bolingbrook and are more like 80% similar. Or does the decompression calculate intermediate points? – M Oehm Mar 01 '14 at 18:02
  • Yes, it compares points between the very first blue point and the last one. Blue ones are just waypoints, used to determine where the route should go through. Route has minimum 2 waypoints. And Google Directions API return a String representation of this route, which can be decoded into the coordinate, which I actually compare. And the route does not necessarily consist of straight lines. It can have curves as well. – Alex K. Mar 01 '14 at 19:06
  • how to apply this method for dart – gsm Jan 30 '21 at 05:25