R: Find a shape from a point cloud

Question

I have a point cloud like the one below:

df <- data.frame(x=c(2,3,3,5,6,2,6,7,7,4,3,8,9,10,10,12,11,12,14,15),
              y=c(6,5,4,4,4,4,3,3,2,3,7,3,2,3,4,6,5,5,4,6))
plot(df,xlab="",ylab="",pch=20)

Think of them as GPS coordinates of an animal's movement. I would like to find the spatial area covered by the points (the animal). The most obvious solution is a convex hull, which produces this:

df1 <- df[chull(x = df$x,y=df$y),]
polygon(x = df1$x,df1$y)

But this is not the result I am looking for. The movement area is not a closed geometric shape, but rather a boomerang kind of shape. The convex hull covers a lot of area not covered by the animal thereby overestimating the area. I am looking for something like this:

Of course, this is a mock dataset to give an idea. The original datasets have a lot more points, and the geometry of the point cloud varies. I was thinking along the lines of DBSCAN or minimum spanning networks, but they don't quite work.

I am not sure how to describe this geometrically or mathematically. If anyone has any ideas on how to approach this (even if it's not a full solution), I would very much appreciate that. If anyone has a better title for this question, that would be nice too :-) Thanks.

Update ----------------------------------------------------------------

Plot of the minimum spanning tree (MST). I think this might be in the right direction.

library(ape)
d <- dist(df)
mstree <- mst(d)
plot(mstree, x1 = df$x, x2 = df$y)

Turns out this problem is about computing an optimal concave hull for a set of points. Here is a python implementation. http://blog.thehumangeo.com/2014/05/12/drawing-boundaries-in-python/ — mindlessgreen, Oct 24 '15 at 20:18

score 1 · Accepted Answer · answered Oct 23 '15 at 10:13

Try alphahull

library(alphahull)

p <- ahull(df$x, df$y, alpha = 2.5)
plot(p)

Still, purely geometric tricks like this are rarely helpful for animal tracking data. They are too ad hoc to carry over to other cases, and they ignore the temporal component, the environment, the uncertainty of the locations, the relationship between the point samples and the real track, and so on.

score 0 · Answer 2 · answered Oct 23 '15 at 09:42

library(geometry)
polyarea(df$x, df$y)
[1] 18.5

This requires the points to be in the right order, though (the order in which they trace the polygon's boundary).
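A minimal base-R sketch of why the order matters, using the shoelace formula. Here shoelace_area is a hypothetical helper written for illustration, not a function from the geometry package, and ordering the points with chull only gives the convex hull area, which is the overestimate the question wants to avoid.

```r
# Hypothetical helper: polygon area via the shoelace formula.
# The vertices must be listed in the order they appear along the boundary.
shoelace_area <- function(x, y) {
  nxt <- c(seq_along(x)[-1], 1)          # index of the next vertex, wrapping around
  abs(sum(x * y[nxt] - x[nxt] * y)) / 2
}

# Unit square, vertices in boundary order
ordered_area <- shoelace_area(c(0, 1, 1, 0), c(0, 0, 1, 1))   # 1

# Same four vertices, but in a self-crossing ("bow-tie") order
crossed_area <- shoelace_area(c(0, 1, 1, 0), c(0, 1, 0, 1))   # 0

# For the question's data, chull() returns hull vertices in boundary order
df <- data.frame(x = c(2,3,3,5,6,2,6,7,7,4,3,8,9,10,10,12,11,12,14,15),
                 y = c(6,5,4,4,4,4,3,3,2,3,7,3,2,3,4,6,5,5,4,6))
hull <- chull(df$x, df$y)
hull_area <- shoelace_area(df$x[hull], df$y[hull])
```

The same four corners give an area of 1 or 0 depending only on their order, so any area computed from the raw row order of df should not be trusted.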

JohannesNE

I think the OP is not looking for the value of the area but for the shape. – Oct 23 '15 at 09:50
I am sorry the title is confusing. It's not just about finding the area of a polygon. Finding the correct polygon in the first place is the issue. – mindlessgreen Oct 23 '15 at 09:58
Ahh right. How about assigning a specific area to each point, and then subtracting the overlap? – JohannesNE Oct 23 '15 at 15:58
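One way to read the last comment is as the area of the union of discs drawn around each point. The following is only a rough base-R sketch of that idea: union_disc_area, the radius r and the grid spacing step are all hypothetical choices made for illustration, and the result depends strongly on r.

```r
# Hypothetical helper: approximate the area of the union of discs of radius r
# centred on each point by counting cells of a fine grid that fall inside
# at least one disc. Overlapping regions are counted only once.
union_disc_area <- function(x, y, r, step = 0.05) {
  gx <- seq(min(x) - r, max(x) + r, by = step)
  gy <- seq(min(y) - r, max(y) + r, by = step)
  grid <- expand.grid(gx = gx, gy = gy)
  covered <- rep(FALSE, nrow(grid))
  for (i in seq_along(x)) {
    covered <- covered | ((grid$gx - x[i])^2 + (grid$gy - y[i])^2 <= r^2)
  }
  sum(covered) * step^2                  # each covered cell contributes step^2
}

df <- data.frame(x = c(2,3,3,5,6,2,6,7,7,4,3,8,9,10,10,12,11,12,14,15),
                 y = c(6,5,4,4,4,4,3,3,2,3,7,3,2,3,4,6,5,5,4,6))
area_r1 <- union_disc_area(df$x, df$y, r = 1)   # r = 1 is an arbitrary assumption
```

The grid makes this an approximation whose accuracy improves as step shrinks, at the cost of memory and time.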

score 0 · Answer 3 · answered Oct 24 '15 at 10:40

You might want to consider an approach based on TSP heuristics. Such approaches are near ideal when all points are relevant.

Below is a simple approach extended from the insertion heuristic for TSP that might be workable, but it's O(N^2) or worse unless you are rather careful with the data structures. The link gives the following description of the convex hull heuristic.

Convex Hull, O(n^2*log^2(n))

1. Find the convex hull of our set of cities, and make it our initial subtour.

2. For each city not in the subtour, find its cheapest insertion (as in step 3 of Nearest Insertion). Then choose the city with the least cost/increase ratio, and insert it.

3. Repeat step 2 until no more cities remain.
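The steps above can be sketched in base R roughly as follows. This is an illustration under assumptions, not the linked page's code: hull_insertion_tour is a hypothetical name, "cheapest insertion" is taken to mean the tour edge (i, j) that minimizes d(i, r) + d(r, j) - d(i, j), and the "cost/increase ratio" is taken to be (d(i, r) + d(r, j)) / d(i, j). It assumes there are no duplicate points, and it inserts every point without dropping any.

```r
# Hypothetical sketch of the convex hull insertion heuristic.
# Returns the row indices of the points in tour order.
hull_insertion_tour <- function(x, y) {
  d <- as.matrix(dist(cbind(x, y)))      # pairwise Euclidean distances
  tour <- chull(x, y)                    # step 1: the hull is the initial subtour
  rest <- setdiff(seq_along(x), tour)
  while (length(rest) > 0) {             # step 3: repeat until nothing is left
    from <- tour
    to <- c(tour[-1], tour[1])           # successor of each tour vertex
    edge_len <- d[cbind(from, to)]
    best <- NULL
    for (r in rest) {                    # step 2: cheapest insertion per city
      detour <- d[from, r] + d[r, to]
      k <- unname(which.min(detour - edge_len))
      ratio <- detour[k] / edge_len[k]
      if (is.null(best) || ratio < best$ratio) {
        best <- list(city = r, pos = k, ratio = ratio)
      }
    }
    tour <- append(tour, best$city, after = best$pos)
    rest <- setdiff(rest, best$city)
  }
  tour
}

df <- data.frame(x = c(2,3,3,5,6,2,6,7,7,4,3,8,9,10,10,12,11,12,14,15),
                 y = c(6,5,4,4,4,4,3,3,2,3,7,3,2,3,4,6,5,5,4,6))
tour <- hull_insertion_tour(df$x, df$y)
```

Calling plot(df) followed by polygon(df$x[tour], df$y[tour]) draws the resulting closed tour over the points. The inner loop rescans every edge for every remaining city, which is where the quadratic-or-worse cost comes from.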

In this case, the cities are the data points, and since the goal isn't to connect all of the data points but rather to get the general shape, an extra step is needed to determine when a data point either shouldn't be added or is no longer needed and can be removed. The issue, though, is that it's not clear which points would be considered irrelevant.

This TSP Test Data site should give you an idea of what the results of that heuristic will look like, and of how you might go about removing points from the resulting "tour" that you consider irrelevant.

One possible solution is to keep track of the original convex hull and limit the increase in distance between two adjacent hull points to some (relatively small) multiple of the original distance between them, which is similar to how alpha hulls work. This would prevent shapes such as the one at the bottom of this TSP Test Case BCL380 page, by limiting the distance that can be traveled between two hull points.
