My question is similar to this link. In that example, 1 codes for a mutation, 0 for wildtype, and NA for not available. My data frame is set up identically, except that it may contain two or more types of mutation per gene. I would like to generate a similar figure, but where a gene has two types of mutation, I would like the square to be cut in half and both types colored in, similar to this example. Currently, if a gene in a subject has two mutations, the fill for the second mutation overwrites the first. Thank you in advance for taking the time to help.

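library(ggplot2)

# Toy data: 10 genes x 50 subjects; 1 = mutation, 0 = wildtype, NA = not available
dat <- expand.grid(gene=1:10, subj=1:50)
dat$mut <- as.factor(sample(c(rep(0,300),rep(1,200)),500))
dat$mut[sample(500,300)] <- NA
dat[501,] <- c(10,50,1) # second entry for gene 10, subject 50 (included from comment below)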
ggplot(dat, aes(x=subj, y=gene, fill=mut)) +
  geom_raster() +
  scale_fill_manual(values = c("#8D1E0B","#323D8D"), na.value="#FFFFFF") +
  scale_x_discrete("Subject") +
  scale_y_continuous(breaks=1:10,
    labels=c("D0","D1","D2","D3","D4","D5","D6","D7","D8","D9")) +
  guides(fill="none") +
  theme(
    axis.ticks.x=element_blank(), axis.ticks.y=element_blank(),
    axis.text.x=element_blank(), axis.text.y=element_text(colour="#000000"), 
    axis.title.x=element_text(face="bold"), axis.title.y=element_blank(),
    panel.grid.major.x=element_blank(), panel.grid.major.y=element_blank(),
    panel.grid.minor.x=element_blank(), panel.grid.minor.y=element_blank(), 
    panel.background=element_rect(fill="#ffffff")
  )

1 Answer

I can't see that your data ever has multiple entries for the same subject and gene. Nothing is being overwritten, because there is nothing to overwrite.

I added a repetition of the last line, but changed mut to 1, to show the effect. I also changed from geom_raster to geom_tile and lowered the opacity, so that tiles with several values get a different colour.

If you want something like the plot you link to, you also need to create shift and height vectors, as that post shows, so that each tile is subdivided.

    dat[501,] = c(10,50,1)

    ggplot(dat, aes(x=subj, y=gene)) +
      geom_tile(alpha=.5, aes(fill=mut), show.legend = FALSE) +
      scale_fill_manual(values = c("#8D1E0B","#323D8D"), na.value="transparent") +
      scale_x_discrete("Subject") +
      scale_y_continuous(breaks=1:10,
                         labels=c("D0","D1","D2","D3","D4","D5","D6","D7","D8","D9")) +
      theme(
        axis.ticks.x=element_blank(), axis.ticks.y=element_blank(),
        axis.text.x=element_blank(), axis.text.y=element_text(colour="#000000"), 
        axis.title.x=element_text(face="bold"), axis.title.y=element_blank(),
        panel.grid.major.x=element_blank(), panel.grid.major.y=element_blank(),
        panel.grid.minor.x=element_blank(), panel.grid.minor.y=element_blank(), 
        panel.background=element_rect(fill="#ffffff")
      )
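
The shift and height idea can be sketched roughly as follows. This is only an illustration and not necessarily how the linked post does it: the toy data, the mutation labels "A" and "B", and the helper columns `cell`, `n` and `k` are all assumptions made up for the example. Each gene/subject cell is split into as many horizontal slices as it has mutation types, and each slice is drawn as its own tile.

```r
library(ggplot2)

# Hypothetical toy data: one row per gene/subject/mutation type.
# Gene 1 in subject 2 carries two mutation types, "A" and "B".
dat <- data.frame(
  gene = c(1, 1, 1, 2, 2),
  subj = c(1, 2, 2, 1, 2),
  mut  = factor(c("A", "A", "B", "B", "A"))
)

# One key per gene/subject cell (assumed helper column).
cell <- paste(dat$gene, dat$subj)

# n = number of rows (mutation types) in the cell,
# k = position of this row within its cell (1, 2, ...).
dat$n <- ave(seq_along(cell), cell, FUN = length)
dat$k <- ave(seq_along(cell), cell, FUN = seq_along)

# Each row gets a slice of height 1/n; shift moves the slice centre
# so that the n slices together fill the unit cell exactly.
dat$height <- 1 / dat$n
dat$shift  <- (dat$k - 0.5) / dat$n - 0.5

p <- ggplot(dat, aes(x = subj, y = gene + shift, height = height, fill = mut)) +
  geom_tile(width = 1) +
  scale_fill_manual(values = c(A = "#8D1E0B", B = "#323D8D")) +
  scale_y_continuous(breaks = 1:2, labels = c("D0", "D1"))
p
```

Cells with a single mutation type get `height = 1` and `shift = 0`, so they look like ordinary tiles. To split the squares left/right instead, the same shift could be added to `x` and the slice size passed as `width`.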

  • Appreciate your advice! I upvoted your comment by the way, but because I have low reputation it doesn't display. – bgene Jan 25 '18 at 22:40
  • bgene: If you feel that the answer fully addresses your question then you should accept it by clicking on the check mark underneath the voting arrows. You can do that even with low reputation. (And you'll gain some reputation by doing so.) – Claus Wilke Jan 26 '18 at 00:38