First question I've asked on stack and I'm pretty new to R, so please pardon any etiquette offenses. I'm plotting 2 stacked area charts using ggplot2. The data is wait events from an Oracle database. It's a performance tuning chart. I have several questions.

  1. The two plots do not line up correctly, most likely due to the width of the text in the legend. Is there an easy solution to this?
  2. The two plots are closely related: the top plot shows wait classes like "CPU" and "User I/O", and the bottom plot shows the details of the specific wait events in those classes. I'd like the colors in the bottom plot to be based on the wait class, the same as the top, just different shades of that color for the specific events. I'm also open to other options if you don't like the concept. It's a lot of information to convey. I've limited the number of events to 12 to fit in the color scheme, but there are more if it can work.
  3. I'd like to either show more granular time ticks on the x axis, or perhaps even shade the off-business hours (6pm-8am) gray just to convey a better sense of time of day.
  4. Are there any commonly used color schemes with more than 12 colors? I looked through brewer and this is the max. I know I could create my own, just curious.

Here's my code:

library(ggplot2)
library(RColorBrewer)
library(gridExtra)

# Wait-class data: parse the timestamp as POSIXct, which ggplot2's date-time scales expect
DF_AAS <- read.csv('http://dl.dropbox.com/u/4131944/Permanent/R-Questions/AAS-Plot/DATA_FRAME_AAS.csv', header = TRUE, sep = ",", stringsAsFactors = TRUE)
DF_AAS <- within(DF_AAS, snap_time <- as.POSIXct(as.character(snap_times2),
                                                 format = "%Y-%m-%d %H:%M:%S"))
DF_AAS[c('snap_times2')] <- NULL

# Wait-event data, prepared the same way
DF_AAS_EVENT <- read.csv('http://dl.dropbox.com/u/4131944/Permanent/R-Questions/AAS-Plot/DF_AAS_EVENT.csv', header = TRUE, sep = ",", stringsAsFactors = TRUE)
DF_AAS_EVENT <- within(DF_AAS_EVENT, snap_time <- as.POSIXct(as.character(snap_times2),
                                                             format = "%Y-%m-%d %H:%M:%S"))
DF_AAS_EVENT[c('snap_times2')] <- NULL

# Top plot: stacked area per wait class
plot_aas_wait_class <- ggplot() +
  geom_area(data = DF_AAS, aes(x = snap_time, y = aas, fill = wait_class),
            stat = "identity", position = "stack", alpha = .9) +
  scale_fill_brewer(palette = "Paired", breaks = sort(levels(DF_AAS$wait_class))) +
  scale_y_continuous(breaks = seq(0, max(DF_AAS$aas) + (max(DF_AAS$aas) * .2), 5)) +
  theme(panel.background = element_rect(colour = "#aaaaaa"))

# Bottom plot: stacked area per wait event
plot_aas_event <- ggplot() +
  geom_area(data = DF_AAS_EVENT, aes(x = snap_time, y = aas, fill = wait_class_event),
            stat = "identity", position = "stack") +
  scale_fill_brewer(palette = "Paired", breaks = levels(DF_AAS_EVENT$wait_class_event)) +
  scale_y_continuous(breaks = seq(0, max(DF_AAS_EVENT$aas) + (max(DF_AAS_EVENT$aas) * .2), 5)) +
  theme(panel.background = element_rect(colour = "#aaaaaa"))

# Stack the two plots in one column, each taking half the height
grid.arrange(plot_aas_wait_class, plot_aas_event, heights = c(1/2, 1/2), ncol = 1)
Andrie
Tyler Muth
  • My only SO etiquette comments would be that (1) we generally ask that people restrict themselves to one question per question, and (2) we ask that you provide _reproducible_ data and code. That way when we write an answer, we can be _sure_ that it works the way you want. Otherwise we're just guessing, and you're asking people to do a lot more work. – joran May 11 '12 at 16:06
  • As for (4), the reason those palettes are restricted to 12 colors is that using more is considered bad practice. The human eye simply can't distinguish that many; even 12 is pushing it. You can of course create your own and do whatever you want. – joran May 11 '12 at 16:08
  • @joran I did provide reproducible data. The read.csv references a public URL. If that's a problem, let me know. I figured multiple questions was probably bad form, but since I'm pretty new to R I didn't know if one might influence the answer to the other. I'm in the "I don't know what I don't know" stage of learning R. Agree with you on the number of colors. I'm open to other visualizations, just haven't come up with a good alternative. – Tyler Muth May 11 '12 at 16:22
  • Nope, my bad! Didn't notice the dropbox bit. – joran May 11 '12 at 16:23

1 Answer

Possibly the easiest solution to the alignment problem is to move the legends around:

library(scales)

# Top plot: legend moved below the panel and wrapped onto two rows
plot_aas_wait_class <- ggplot() +
  geom_area(data = DF_AAS, aes(x = snap_time, y = aas, fill = wait_class),
            stat = "identity", position = "stack", alpha = .9) +
  scale_fill_brewer(palette = "Paired", breaks = sort(levels(DF_AAS$wait_class))) +
  scale_y_continuous(breaks = seq(0, max(DF_AAS$aas) + (max(DF_AAS$aas) * .2), 5)) +
  theme(panel.background = element_rect(colour = "#aaaaaa")) +
  theme(legend.position = "bottom", legend.direction = "horizontal") +
  guides(fill = guide_legend(nrow = 2))

# Bottom plot: legend moved below the panel and laid out in two columns
plot_aas_event <- ggplot() +
  geom_area(data = DF_AAS_EVENT, aes(x = snap_time, y = aas, fill = wait_class_event),
            stat = "identity", position = "stack") +
  scale_fill_brewer(palette = "Paired", breaks = levels(DF_AAS_EVENT$wait_class_event)) +
  scale_y_continuous(breaks = seq(0, max(DF_AAS_EVENT$aas) + (max(DF_AAS_EVENT$aas) * .2), 5)) +
  theme(panel.background = element_rect(colour = "#aaaaaa")) +
  theme(legend.position = "bottom", legend.direction = "horizontal") +
  guides(fill = guide_legend(ncol = 2))

# With both legends below the panels, the plotting areas share the full width
grid.arrange(plot_aas_wait_class, plot_aas_event, heights = c(1/2, 1/2), ncol = 1)

To increase the resolution on the x axis, I'd add something like this to each plot:

+ scale_x_datetime(breaks = date_breaks("2 hours"))

or whatever breaks you'd prefer.
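
If it helps, here is a minimal sketch of two other ways to get the same breaks. It rests on a few assumptions: the time range is made up (with the real data it would come from range(DF_AAS$snap_time)), the demo data frame is purely illustrative, and the date_breaks and date_labels arguments need a reasonably recent ggplot2.

```r
# Made-up time range; with the real data use range(DF_AAS$snap_time) instead.
t_min <- as.POSIXct("2012-05-08 00:00:00", tz = "UTC")
t_max <- as.POSIXct("2012-05-09 00:00:00", tz = "UTC")

# Explicit break positions every two hours, built in base R.
two_hourly <- seq(t_min, t_max, by = "2 hours")

# The plotting part only runs when ggplot2 is installed.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  # Illustrative data only: one point per break.
  demo <- data.frame(snap_time = two_hourly, aas = seq_along(two_hourly))

  # Option 1: pass the break positions directly and label them as HH:MM.
  p1 <- ggplot(demo, aes(snap_time, aas)) +
    geom_area() +
    scale_x_datetime(breaks = two_hourly, date_labels = "%H:%M")

  # Option 2: let the scale compute the breaks from a spec string.
  p2 <- ggplot(demo, aes(snap_time, aas)) +
    geom_area() +
    scale_x_datetime(date_breaks = "2 hours", date_labels = "%H:%M")
}
```

Printing p1 or p2 draws the plot. Building the breaks yourself is mostly useful when you want them anchored to specific times, such as the start and end of business hours.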

Shading a particular region is typically done with geom_rect, with alpha set to something like 0.25. This requires a separate data frame holding the start and end points of each rectangle on the x axis, which you pass to geom_rect; use -Inf and Inf for the y coordinates so that each rectangle spans the full height of the panel.
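
A rough sketch of that idea for the 6pm-8am window follows. I'm assuming synthetic data here (the demo data frame and its sine-wave aas values are invented stand-ins for DF_AAS) and a fixed UTC time zone, so adjust both for the real snapshots.

```r
# Synthetic stand-in for the snapshot data: 15-minute samples over three days.
snap_time <- seq(as.POSIXct("2012-05-08 00:00:00", tz = "UTC"),
                 as.POSIXct("2012-05-10 23:45:00", tz = "UTC"),
                 by = "15 min")
demo <- data.frame(snap_time = snap_time,
                   aas = 5 + 3 * sin(seq_along(snap_time) / 10))

# Midnight of every day in the data, plus the day before, so the early
# morning of the first day is covered too.
first_day <- as.POSIXct(trunc(min(demo$snap_time), "days"))
last_day  <- as.POSIXct(trunc(max(demo$snap_time), "days"))
day_start <- seq(first_day - 24 * 3600, last_day, by = "1 day")

# One rectangle per night: 18:00 until 08:00 the next morning,
# clipped to the time range of the data.
off_hours <- data.frame(
  xmin = pmax(day_start + 18 * 3600, min(demo$snap_time)),
  xmax = pmin(day_start + 32 * 3600, max(demo$snap_time))
)
off_hours <- off_hours[off_hours$xmin < off_hours$xmax, ]

# The plotting part only runs when ggplot2 is installed.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot() +
    geom_area(data = demo, aes(x = snap_time, y = aas),
              fill = "steelblue", alpha = 0.9) +
    # Drawn after the area so the translucent gray sits on top of it.
    geom_rect(data = off_hours, aes(xmin = xmin, xmax = xmax),
              ymin = -Inf, ymax = Inf, fill = "grey50", alpha = 0.25)
}
```

Clipping the rectangles to the data range keeps them from stretching the x axis past the first and last snapshots. The same geom_rect layer can be added to both plot_aas_wait_class and plot_aas_event.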

joran