I have data collected at three sites. At each site, the data were collected several times for each of several subjects.

Here's what the data look like:

set.seed(1)
df <- data.frame(site = c(rep("AA",1000),rep("BB",500),rep("CC",750)),
                 y = c(rnorm(1000,1,2),runif(500,1,3),rgamma(750,shape=1)))

# add subjects: randomly generate the number of observations per subject,
# so that the counts add up to the total number of observations at that site

site_a_subjects <- diff(c(0, sort(20*sample(19)), 1000))
site_b_subjects <- diff(c(0, sort(30*sample(9)), 500))
site_c_subjects <- diff(c(0, sort(40*sample(4)), 750))

#add these subjects
df$site_subjects <- c(unlist(sapply(1:20, function(x) rep(letters[x], site_a_subjects[x]))),
                  unlist(sapply(1:10, function(x) rep(letters[x], site_b_subjects[x]))),
                  unlist(sapply(1:5, function(x) rep(letters[x], site_c_subjects[x]))))

I want to plot a histogram of y for each site. This simple ggplot2 line does that:

library(ggplot2)
ggplot(df, aes(x=y)) + geom_histogram(colour="black", fill="white") + facet_grid(. ~ site)

However, I also want to add an inset subplot to each site's histogram: a histogram of the number of observations per subject at that site. Something like adding:

hist(table(df$site_subjects[which(df$site == "AA")]))
hist(table(df$site_subjects[which(df$site == "BB")]))
hist(table(df$site_subjects[which(df$site == "CC")]))

to the three site histograms, respectively.
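To make the quantity being plotted concrete, here is a minimal base R sketch of the per-site, per-subject counts that those three hist(table(...)) calls are built on. The toy data frame and its values are made up for illustration; only the column names match df.

```r
# Toy data with the same column names as df (values are invented for illustration)
toy <- data.frame(
  site = c("AA", "AA", "AA", "BB", "BB", "BB", "BB"),
  site_subjects = c("a", "a", "b", "a", "b", "b", "b"),
  stringsAsFactors = FALSE
)

# Split the subject labels by site, then count observations per subject
obs_per_subject <- lapply(split(toy$site_subjects, toy$site), table)

# Counts as plain integer vectors, one list element per site
counts <- lapply(obs_per_subject, function(tab) as.vector(tab))
```

Each element of counts is what hist() would receive for that site, so the same list could in principle feed any per-site inset.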

Any idea how that can be done?

I wonder if annotation_custom can be tweaked to achieve this?

The code below would work, but only if the

ggplotGrob(ggplot(df, aes(x=site_subjects)) + geom_bar() + theme_bw(base_size=9))

command could accept a list of ggplot objects or something like that.
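For what it's worth, building the list of per-site inset grobs is the easy part. A minimal sketch, assuming a toy data frame (toy and inset_grobs are names made up here, not part of the code above):

```r
library(ggplot2)

# Small stand-in for df (same column names; values are arbitrary)
set.seed(1)
toy <- data.frame(
  site = rep(c("AA", "BB", "CC"), each = 40),
  site_subjects = sample(letters[1:4], 120, replace = TRUE)
)

sites <- unique(as.character(toy$site))

# One inset grob per site, each built from that site's rows only
inset_grobs <- lapply(sites, function(s) {
  ggplotGrob(
    ggplot(toy[toy$site == s, ], aes(x = site_subjects)) +
      geom_bar() +
      theme_bw(base_size = 9)
  )
})
names(inset_grobs) <- sites
```

The sticking point is the next step: annotation_custom is documented as a static annotation that is the same in every panel, so a single call cannot pick a different grob from this list for each facet.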

Here's the 'almost' solution. First, find the maximum bar height among all the facet histograms:

ymax <- max(sapply(unique(df$site), function(x) max(hist(df$y[which(df$site == x)],plot=FALSE)$counts)))
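One caveat with this: hist() chooses its own breaks, so its counts need not match the bars that geom_histogram draws (30 bins by default). A sketch of reading the bar heights from the plot itself with layer_data(), using a toy data frame with made-up values:

```r
library(ggplot2)

# Small stand-in for df (values are arbitrary)
set.seed(1)
toy <- data.frame(site = rep(c("AA", "BB"), times = c(200, 100)),
                  y = c(rnorm(200), runif(100)))

p <- ggplot(toy, aes(x = y)) +
  geom_histogram(bins = 30) +
  facet_grid(. ~ site)

# Layer 1 of the built plot has one row per drawn bar, with its count and PANEL
bars <- layer_data(p, 1)
ymax_gg <- max(bars$count)
```

Here ymax_gg is the height of the tallest bar actually drawn, across all facets.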

Then:

main.plot <- ggplot(df, aes(x=y)) + geom_histogram(colour="black", fill="gray") + facet_grid(~site) + scale_y_continuous(limits=c(0,1.2*ymax))
main.plot.info <- ggplot_build(main.plot)
xmin <- min(main.plot.info$data[[1]]$x[which(main.plot.info$data[[1]]$PANEL == 1)])
xmax <- max(main.plot.info$data[[1]]$x[which(main.plot.info$data[[1]]$PANEL == 1)])
main.plot <- main.plot + annotation_custom(grob = grid::roundrectGrob(),xmin = xmin, xmax = xmax, ymin=ymax, ymax=1.2*ymax)
sub.plot <- ggplotGrob(ggplot(df, aes(x=site_subjects)) + geom_bar() + theme_bw(base_size=9))
combined.plot <- main.plot +  annotation_custom(grob = sub.plot, xmin = xmin, xmax = xmax, ymin=ymax, ymax=1.2*ymax)

And the result is: [image omitted]

user1701545

2 Answers

One way to do this is to create the main plot and then add each inset plot by creating viewports at each of the locations where you want an inset plot. We use functions from the grid package for these operations. Here's an example:

library(ggplot2)
library(grid)

# Function to draw the inset plots 
pp = function(var) {
  grid.draw(
    ggplotGrob(
      ggplot(df[df$site==var,], aes(site_subjects)) +
        geom_bar() +
        theme_bw(base_size=9)
    )
  )
}

# Function to place the viewports on the main graph
my_vp = function(x) {
  viewport(x=x, y=.8, width=0.25, height=0.2)
}

# Main plot
ggplot(df, aes(x=y)) + geom_histogram(colour="black", fill="white") + 
  facet_grid(. ~ site) +
  scale_y_continuous(limits=c(0,400))

# Draw each inset plot in a separate viewport
vp = my_vp(0.22)
pushViewport(vp)
pp("AA")
popViewport()

vp = my_vp(0.52)
pushViewport(vp)
pp("BB")
popViewport()

vp = my_vp(0.84)
pushViewport(vp)
pp("CC")
popViewport()
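The three push/draw/pop blocks above can also be written as a loop over sites. This is only a sketch on a toy data frame with made-up values; the x positions are still the hand-tuned ones, and the names toy and inset_x are hypothetical.

```r
library(ggplot2)
library(grid)

# Small stand-in for df (same column names; values are arbitrary)
set.seed(1)
toy <- data.frame(
  site = rep(c("AA", "BB", "CC"), each = 60),
  y = rnorm(180),
  site_subjects = sample(letters[1:4], 180, replace = TRUE)
)

# Centre of each inset in npc units (hand-tuned, as in the blocks above)
inset_x <- c(AA = 0.22, BB = 0.52, CC = 0.84)

# Draw the main plot first so the insets land on top of it;
# expand_limits() leaves headroom above the bars without dropping any data
print(
  ggplot(toy, aes(x = y)) +
    geom_histogram(colour = "black", fill = "white", bins = 30) +
    facet_grid(. ~ site) +
    expand_limits(y = 20)
)

# One viewport per site: push, draw that site's inset, pop
for (s in names(inset_x)) {
  inset <- ggplotGrob(
    ggplot(toy[toy$site == s, ], aes(site_subjects)) +
      geom_bar() +
      theme_bw(base_size = 9)
  )
  pushViewport(viewport(x = inset_x[[s]], y = 0.8, width = 0.25, height = 0.2))
  grid.draw(inset)
  popViewport()
}
```

Popping after every inset returns to the top-level viewport, so each x position is interpreted relative to the whole page rather than to the previous inset.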

eipi10
  • Nice!!! Is there a way to predetermine the x locations of the facets of the main plot so they can be passed to my_vp instead of hard-coding the values? – user1701545 Mar 02 '16 at 06:23
  • And also, how would I change this code so that I can plot the figure to a file? – user1701545 Mar 02 '16 at 06:53
  • For your second question, right before you start the main plot do `tiff("myplot.tiff", 1000, 800)` (or whatever file name and resolution you want) then after adding all the inset plots do `dev.off()`. – eipi10 Mar 02 '16 at 15:04
  • For your first question, I'm sure there's a way, but my knowledge of `grid` and of the under-the-hood structure of ggplot objects isn't quite deep enough yet to figure it out. You could try asking about this on the `ggplot2` google group or hope that @baptiste sees this question and provides some ideas. – eipi10 Mar 02 '16 at 15:17
Here's something reasonable:

library(ggplot2)
library(gridExtra)

# Tallest bar across sites (using hist()'s own binning), for a shared y-axis range
ymax <- max(sapply(unique(df$site), function(x) max(hist(df$y[which(df$site == x)],plot=FALSE)$counts)))
sites <- unique(df$site)
# lapply always returns a plain list, which is what grid.arrange(grobs=) expects
plot.list <- lapply(sites, function(s) {
  main.plot = ggplot(df[which(df$site == s),], aes(x=y)) + geom_histogram(colour="black", fill="gray") + scale_y_continuous(limits=c(0,1.5*ymax))
  main.plot.info = ggplot_build(main.plot)
  xmin = min(main.plot.info$data[[1]]$x[which(main.plot.info$data[[1]]$PANEL == 1)])
  xmax = max(main.plot.info$data[[1]]$x[which(main.plot.info$data[[1]]$PANEL == 1)])
  # Inset: bar chart of observations per subject at this site only
  sub.plot = ggplotGrob(ggplot(df[which(df$site == s),], aes(x=site_subjects)) + geom_bar() + theme_bw(base_size=9))
  ggplotGrob(main.plot + annotation_custom(grob = sub.plot, xmin = xmin, xmax = xmax, ymin=0.8*ymax, ymax=1.2*ymax))})

grid.arrange(grobs=plot.list, ncol=3)
user1701545