
Problem definition

I need to produce a number of specific graphs, and on these graphs, highlight subsets of vertices (nodes) by drawing a contour/polygon/range around or over them (see image below).

  • A graph may have multiple of these contours/ranges, and they may overlap, iff one or more vertices belong to multiple subsets.
  • Given a graph of N vertices, any subset may be of size 1..N.
  • However, vertices not belonging to a subset must not be inside its contour (that would be misleading, so this is priority no. 1). This is the gist of my problem.
  • All these graphs happen to have the property that the ranges are continuous, as the data they represent covers only directly connected subsets of vertices.
  • All graphs will be undirected and connected (no unconnected vertices will ever be plotted).

Reproducible attempts

I am using R and the igraph package. I have already tried some solutions, but none of them work well enough.

First attempt, mark.groups in plot.igraph:

library(igraph)
g <- make_graph("Frucht")
# tree layout rooted at vertex 1 (older igraph versions: layout.reingold.tilford(g, 1))
l <- layout_as_tree(g, root = 1)
# bad: vertex 11 should not be inside the contour
plot(g, layout = l, mark.groups = c(1, 3, 6, 12, 5), mark.shape = 1)
# bad: vertex 3 should not be inside the contour (image below)
plot(g, layout = l, mark.groups = c(1, 6, 12, 5, 11), mark.shape = 1)
# just choosing another layout here is not a generalizable solution

[Image: output of the second plot call; the contour also covers vertex 3]

plot.igraph calls igraph.polygon, which in turn uses convex_hull (also igraph) and xspline. The result is, from what I understand, a smoothed convex hull (which otherwise looks very nice!), but for my purposes that is not precise enough: it also covers vertices that should not be covered.
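To make the failure detectable rather than only visible, here is a minimal base-R sketch that lists the non-member points lying inside the convex hull of a group. The helper name hull_intruders is hypothetical, and xy is assumed to be a two-column coordinate matrix such as the one returned by an igraph layout function. The shape that mark.groups actually draws is expanded and smoothed around the hull, so this check can only under-report the problem.

```r
# Hypothetical helper: indices of non-member points that lie inside
# (or on the border of) the convex hull of the member points.
hull_intruders <- function(xy, members) {
  pts  <- xy[members, , drop = FALSE]
  hull <- pts[chull(pts), , drop = FALSE]   # hull corners in clockwise order
  n <- nrow(hull)
  if (n < 3) return(integer(0))             # a point or a segment has no interior
  nxt <- hull[c(2:n, 1), , drop = FALSE]    # the following corner of each hull edge
  inside <- function(p) {
    # sign of the cross product tells on which side of each hull edge p lies
    cross <- (nxt[, 1] - hull[, 1]) * (p[2] - hull[, 2]) -
             (nxt[, 2] - hull[, 2]) * (p[1] - hull[, 1])
    all(cross <= 0) || all(cross >= 0)
  }
  others <- setdiff(seq_len(nrow(xy)), members)
  others[vapply(others, function(i) inside(xy[i, ]), logical(1))]
}

# Toy example: a square (points 1-4), one point in its middle, one outside.
xy <- rbind(c(0, 0), c(2, 0), c(2, 2), c(0, 2), c(1, 1), c(3, 3))
hull_intruders(xy, 1:4)   # point 5 lies inside the hull, point 6 does not
```

With an igraph layout matrix l, a non-empty result of hull_intruders(l, group) would be the trigger for a warning that the hull-based contour is misleading for that group.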

Second attempt with contour. So I tried implementing my own version, based on the solution suggested here:

library(MASS)
xx <- runif(5, 0, 1)
yy <- abs(xx) + rnorm(5, 0, 0.2)
# pad the plotting region by one standard deviation on each side
xlim <- c(min(xx) - sd(xx), max(xx) + sd(xx))
ylim <- c(min(yy) - sd(yy), max(yy) + sd(yy))
plot(xx, yy, xlim = xlim, ylim = ylim)
# 2D kernel density estimate; the y bandwidth is taken from yy (the original reused xx)
dens2 <- kde2d(xx, yy, lims = c(xlim, ylim), h = c(bandwidth.nrd(xx) / 1.5, bandwidth.nrd(yy) / 1.5), n = 50)
contour(dens2, levels = 0.001, col = "red", add = TRUE, drawlabels = FALSE)

The contour plot looks, in principle, like something I could use, given enough tweaking of the bandwidth and level values (to make the contour snug enough that it doesn't cover any points outside the group). However, this solution has a drawback: when the level value is too small, the contour breaks (it no longer produces a single continuous area). So if I went that way, I would have to check for continuity automatically and determine good bandwidth/level values on the fly. Another problem is that I cannot quite see how I could plot the contour over the plots produced by igraph: the layout.* commands produce what looks like a coordinate matrix, but the coordinates do not match the axis coordinates on the plot:

# compare the raw layout coordinates with the axes of the plot:
layout_as_tree(g, root = 1)
plot(g, layout = l, axes = TRUE)
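A note on the mismatch, as a sketch rather than a definitive fix: by default plot.igraph rescales the layout to the range [-1, 1] in both directions (rescale = TRUE), so the axes show the rescaled coordinates, not the raw layout values. One way to get a shared coordinate system is to do the rescaling explicitly with norm_coords and then switch it off in the plot call. The variable name l_plot and the choice of vertices marked below are only for illustration.

```r
library(igraph)

g <- make_graph("Frucht")
l <- layout_as_tree(g, root = 1)

# Rescale the layout ourselves to the range plot.igraph uses by default ...
l_plot <- norm_coords(l, xmin = -1, xmax = 1, ymin = -1, ymax = 1)

# ... and tell plot.igraph not to rescale again.
plot(g, layout = l_plot, rescale = FALSE, axes = TRUE)

# Base-graphics calls now use the same coordinates as the vertices,
# e.g. mark vertices 1 and 6 with a red cross.
points(l_plot[c(1, 6), , drop = FALSE], pch = 4, col = "red", cex = 2)
```

Anything drawn afterwards with polygon(), lines() or contour(..., add = TRUE) in the coordinates of l_plot should then line up with the vertices.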

The question:

What would be a better way to plot such ranges on graphs (ideally igraph objects) in R so that they meet the criteria outlined above: ranges that include only the vertices belonging to their subset and exclude all others, while remaining continuous?

The solution I am looking for should scale to graphs of different sizes and layouts that I may need to create (so hand-tweaking each graph using e.g. tkplot is not a good solution). I am aware that on some graphs with some vertex groups, meeting both criteria will indeed be impossible in practice, but intuitively it should be possible to implement something that still works most of the time with smallish (10..20 vertices) and not-too-complex graphs (ideally it would be possible to detect and give a warning if a perfectly fitting range could not be plotted). An improvement of the mark.groups approach (not necessarily within the package, but using the hull idea mentioned above), something based on contour or a similar suitable function, or something else entirely would all be welcome, as long as it works (most of the time).

Update stemming from the discussion: a solution that only utilizes functions of core R or CRAN packages (not external software) is desirable, since I will eventually want to incorporate this functionality in a package.

Edit: specified the last paragraph as per the comments.

user3554004
  • Since hand-tweaking is not acceptable, you should make a feature request to the package author. – IRTFM Oct 17 '15 at 16:34
  • I would be happy to implement a function of my own, or utilize some suitable geometric function from another package to accomplish this, if somebody could get me started in the right direction (at least). I would suppose if one were to make a request to the author, even if they agree, one would still have to wait until the next version of the package. – user3554004 Oct 17 '15 at 19:40
  • Try passing mark.groups as a list, for example: mark.groups = list(c(1,3,5,6,11,12), c(1,5,6,11,12)) – Oliver Aug 01 '18 at 01:11

1 Answer


The comment area is not long enough to fit my answer there, so I'm putting this here, although I'd rather post it as a comment as it is not a full solution.

Quite a long shot, but the first thing that popped into my mind is support vector machines. The idea would be to train a support vector machine classifier that sorts your points into two groups (in or out) based on the coordinates of the vertices, using some non-linear kernel function (I would try the radial basis function). Then you plot the decision boundary of the trained classifier (the separating hyperplane mapped back to the plane). One drawback is that the area obtained this way might be unbounded (i.e. extend to infinity in some directions), so this idea definitely requires some further thinking, but at least it is one possible direction to go.
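A rough sketch of what this could look like in R, assuming the e1071 package (a CRAN interface to libsvm) and made-up random coordinates in place of a real graph layout. The cost and gamma values are arbitrary guesses, not tuned settings, and nothing here guarantees a bounded or single-piece region.

```r
library(e1071)

set.seed(1)
# Made-up vertex coordinates; in practice this would be the layout matrix.
xy <- cbind(x = runif(12), y = runif(12))
in_group <- factor(seq_len(12) %in% c(1, 2, 3))   # vertices 1-3 form the group

# RBF-kernel classifier on the raw coordinates (cost and gamma are guesses).
fit <- svm(xy, in_group, kernel = "radial", cost = 100, gamma = 5, scale = FALSE)

# Evaluate the decision function on a grid covering the plotting region.
gx <- seq(-0.2, 1.2, length.out = 100)
gy <- seq(-0.2, 1.2, length.out = 100)
grid <- as.matrix(expand.grid(x = gx, y = gy))
dv <- attr(predict(fit, grid, decision.values = TRUE), "decision.values")
z <- matrix(dv, nrow = length(gx))                # rows follow gx, columns gy

# The zero level of the decision function is the class boundary.
plot(xy, pch = 19, col = ifelse(in_group == "TRUE", "red", "grey40"))
contour(gx, gy, z, levels = 0, add = TRUE, drawlabels = FALSE)
```

Whether the boundary closes within the plotting region depends on the points and on the parameters, which is the unboundedness concern mentioned above.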

Tamás
  • Thanks for getting the discussion started! But as far as I know SVMs, this seems like trying to hit a fly with a cannon, not to mention the unboundedness (in a sense it just gives the inverse of the convex hull problem). All I need is something similar to what `convex_hull` in `mark.groups` does, but with a shape that excludes the vertices it is not meant to cover. An idea I had in the meantime: cover each pair of vertices in a range with a rectangle along the edge connecting the two, then merge/dissolve the rectangles of one range into a single polygon; I haven't found a way to do that in R yet, though; ideas? – user3554004 Oct 19 '15 at 20:26
  • Polygon unions seem to be implemented in the GEOS library and it has an [R interface](https://cran.r-project.org/web/packages/rgeos/index.html) - so all it takes is to generate the rectangles as polygons, then use `gUnion{rgeos}` to calculate their unions. Finally, some smoothing is probably needed for the corners to make the result look nice. (A word of warning: I am not an R expert and have never used `rgeos` myself). – Tamás Oct 19 '15 at 20:49
  • As for corners, true; what I had in mind were round-cornered-rectangles. Anyhow, I looked up GEOS, sounds good, but the problem is that it seems to require the separate installation of an external program/library. I will want to develop the prototype here into a package in the (very) long run, so package dependencies would be fine, but I don't think requiring users to install separate programs would be good practice. Conceptually it seems such a simple idea (then again geometry is not my strong suit), I wonder if it couldn't be done just within R... – user3554004 Oct 19 '15 at 21:08
  • Another alternative for polygon operations is the `gpc` library and its [R interface](https://cran.r-project.org/web/packages/gpclib/index.html). The `gpc` library has a more restrictive license, though, so you need to accept it before starting to use it. – Tamás Oct 20 '15 at 20:09
  • Unfortunately it doesn't have binaries and needs compilation, which would bring a hassle in the installation, if I were to use it as a dependency. However, while looking into it, I came across `polyclip`, which at first glance seems to suit my needs. I'd have to develop my own code to create the building blocks of the ranges on the graphs (along the edges), and then use the union operation to merge them. Will try that out, but +1 for pointing me in the right direction. – user3554004 Oct 21 '15 at 10:48
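
The idea from the comments above (a strip along each edge inside the group, merged by a polygon union) could be sketched roughly as follows with `polyclip`. This is an untuned illustration, not a finished solution: the helper names group_outline and outline_intruders and the strip width are hypothetical, the example group (vertex 1 plus its neighbours) is chosen only because it is guaranteed to be connected, and a group of size 1 (no edges) is not handled.

```r
library(igraph)
library(polyclip)

# Hypothetical helper: union of round-ended strips drawn along every edge
# whose two endpoints both belong to `members`. Returns a list of polygons.
group_outline <- function(g, xy, members, width = 0.08) {
  el <- as_edgelist(g, names = FALSE)
  el <- el[el[, 1] %in% members & el[, 2] %in% members, , drop = FALSE]
  strips <- lapply(seq_len(nrow(el)), function(i) {
    seg <- list(x = xy[el[i, ], 1], y = xy[el[i, ], 2])
    polylineoffset(seg, delta = width, jointype = "round", endtype = "openround")
  })
  Reduce(function(a, b) polyclip(a, b, op = "union",
                                 fillA = "nonzero", fillB = "nonzero"), strips)
}

# Hypothetical helper: non-members lying inside or on the border of any polygon.
outline_intruders <- function(outline, xy, members) {
  others <- setdiff(seq_len(nrow(xy)), members)
  hit <- vapply(others, function(i) {
    p <- list(x = xy[i, 1], y = xy[i, 2])
    any(vapply(outline, function(poly) pointinpolygon(p, poly) != 0, logical(1)))
  }, logical(1))
  others[hit]
}

g <- make_graph("Frucht")
xy <- norm_coords(layout_as_tree(g, root = 1))   # rescaled to [-1, 1]
members <- c(1, as_ids(neighbors(g, 1)))         # vertex 1 and its neighbours
outline <- group_outline(g, xy, members)

bad <- outline_intruders(outline, xy, members)
if (length(bad) > 0) message("outline also covers: ", paste(bad, collapse = ", "))

plot(g, layout = xy, rescale = FALSE)
for (poly in outline) polygon(poly$x, poly$y, border = "red", lwd = 2)
```

Because the strips follow the edges, the outline stays in one piece whenever the group is connected, and the intruder check gives a way to warn when a non-member vertex happens to sit on or next to one of the group's edges in the chosen layout.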