Context: I want to create an interactive heatmap with areas separated by ZIP code. I've found no way of displaying this directly (e.g. using Google Maps or OSM), so I want to create curves or lines that separate those areas and draw them on a map.

I have a set of points, each given by its coordinates and its class (ZIP code). I want to get the curves separating the classes. The problem is that these points are not linearly separable.

I tried softmax regression, but it doesn't work well with classes that are not linearly separable. The only methods I know that can separate such classes are nearest neighbors and neural networks. But such classifiers only classify; they don't tell me where the borders between classes are.
Is there a way to get the borders somehow?

zdevaty

1 Answer

If you have a dense cloud of known points within each ZIP code, given as [latitude, longitude, ZIP code], using machine learning to find the boundary enclosing those points sounds like overkill.

You could probably get a good approximation of the boundary by using computational geometry, e.g. by finding the 2D convex hull of each ZIP code's set of points with the MATLAB convhull function:

K = convhull(X,Y)

The result K is a vector of indices into X and Y, not the points themselves. X(K) and Y(K) give the hull vertices in order around the input points, and can be used to draw a polygon.
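As a minimal sketch of doing this per ZIP code: the sample data and the variable names (`x`, `y`, `zipCode`, `hulls`) below are made up for illustration, and the coordinates are assumed to be already in a planar (x, y) system. Note that `convhull` needs at least three non-collinear points per class.

```matlab
% Hypothetical sample data: two ZIP codes, five points each (planar x, y).
x       = [0 1 1 0 0.5  3 4 4 3 3.5]';
y       = [0 0 1 1 0.5  0 0 1 1 0.5]';
zipCode = [1 1 1 1 1    2 2 2 2 2]';

codes = unique(zipCode);
hulls = cell(numel(codes), 1);          % hull vertex indices per ZIP code

for i = 1:numel(codes)
    idx = find(zipCode == codes(i));    % rows belonging to this ZIP code
    k   = convhull(x(idx), y(idx));     % indices into x(idx), y(idx); closed loop
    hulls{i} = idx(k);                  % map back to indices into x, y
end

% Draw each hull outline on top of the points.
figure; hold on
for i = 1:numel(codes)
    plot(x(hulls{i}), y(hulls{i}), '-');
end
scatter(x, y, 20, zipCode, 'filled');
hold off
```

Each `hulls{i}` repeats its first index at the end, so plotting it directly gives a closed polygon. Interior points (here the fifth point of each class) do not appear in the hull.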

The only complication is which coordinate system to work in: you might need to do a bit of work converting between (lat, lon) and map (x, y) coordinates. If you do not have the MATLAB Mapping Toolbox, you could look at the third-party library M_Map (see the M_Map home page), which offers some of the same functionality.
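For a small region, one simple option that needs no toolbox is a local equirectangular approximation around the center of the data. This is only a sketch under that assumption, not what the Mapping Toolbox or M_Map do; the sample coordinates are approximate and for illustration only, and `R`, `lat0`, `lon0` are names introduced here.

```matlab
% Approximate coordinates of three Czech cities (illustrative values only).
lat = [50.08 49.19 49.82]';    % degrees north
lon = [14.43 16.61 18.26]';    % degrees east

R    = 6371;                   % mean Earth radius in km (spherical approximation)
lat0 = mean(lat);              % reference latitude for the local projection
lon0 = mean(lon);              % reference longitude

% Equirectangular approximation: scale longitude by cos(reference latitude).
x = R * (lon - lon0) * (pi/180) * cos(lat0 * pi/180);   % km east of center
y = R * (lat - lat0) * (pi/180);                        % km north of center
```

The resulting `x` and `y` are in kilometers relative to the data center and can be passed to planar routines such as `convhull`. The distortion grows with the extent of the region, so for larger areas a proper map projection is the safer choice.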

Edit: If the cloud of points for a ZIP code has a non-convex bounding region, you may need a more general computational geometry technique to get a better approximation of that region. Performing a Voronoi tessellation of all the points, as suggested in the comments, is one such possibility.
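A rough sketch of the Voronoi idea, built on the Delaunay triangulation (its dual): every Delaunay edge joining two points with different ZIP codes corresponds to a Voronoi edge that runs between the circumcenters of the two triangles sharing it. The random data, the curved class boundary, and the names `zipCode`, `isBorder`, and `segs` are assumptions made for this example.

```matlab
% Hypothetical data: random points split into two classes by a curved boundary.
rng(0);
x = rand(40, 1);
y = rand(40, 1);
zipCode = 1 + (y > 0.5 + 0.3 * sin(2 * pi * x));

DT  = delaunayTriangulation(x, y);
E   = edges(DT);                  % Delaunay edges as pairs of point indices
cc  = circumcenter(DT);           % circumcenters = Voronoi vertices
att = edgeAttachments(DT, E);     % triangles attached to each edge

% Keep only edges whose endpoints belong to different ZIP codes.
isBorder = zipCode(E(:, 1)) ~= zipCode(E(:, 2));

segs = zeros(0, 4);               % rows: [x1 y1 x2 y2] of border segments
for e = find(isBorder)'
    t = att{e};
    if numel(t) == 2              % skip hull edges (unbounded Voronoi edges)
        segs(end + 1, :) = [cc(t(1), :), cc(t(2), :)];
    end
end

figure; hold on
plot(segs(:, [1 3])', segs(:, [2 4])', 'k-');
scatter(x, y, 20, zipCode, 'filled');
hold off
```

Edges on the convex hull of the data are skipped because their Voronoi edges extend to infinity, and circumcenters of thin triangles near the hull can fall well outside the data, so the outermost segments may need clipping to the map extent.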

paisanco
  • Thank you, that seems like a good solution. I hadn't realized the problem with coordinates; however, I wonder whether I need to convert them at all when I'm working on a relatively small area (Czech Republic). I guess I'll see. – zdevaty Feb 04 '18 at 17:48
  • The convex hull would return regions that are convex. ZIP code areas usually are not. A more suitable approach is to use a Voronoi tessellation of all points, keeping each line that separates points with different ZIP codes. – Cris Luengo Feb 04 '18 at 20:26
  • That's another thing to look at, good point, depending on how good an approximation of the area the OP wants. The point being that computational geometry, not machine learning, is the way to approach this problem. IIRC there is support for Voronoi tessellation in vanilla MATLAB as well. – paisanco Feb 04 '18 at 20:29
  • A rough estimate would be good enough for me, but the area I'm focused on is not convex at all, so the hulls would overlap a lot. It seems the Voronoi tessellation would be a much better fit. Thank you, @Cris Luengo! – zdevaty Feb 05 '18 at 10:46
  • I used machine learning because I studied it lately, and I didn't know where to look for better solutions. – zdevaty Feb 05 '18 at 10:48