I'm trying to combine geom_tile with geom_point to show the effect of a gene expression knockdown on cardiac phenotypes, and whether this effect is enhanced or reduced compared to a control.
The following code creates an example of this:
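library(readxl)    # read_excel()
library(dplyr)     # %>% and mutate_all()
library(reshape2)  # melt(); assumed here, since melt() is called with id and measure variables on a data frame
library(ggplot2)
g <- read_excel(path = file.path(WORKING_DIR, "heatmap.xlsx"))
g <- data.frame(fly_gene = g[, 1], g[, 2:65])
g <- g %>% mutate_all(as.character)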
g1 <- melt(g, "fly_gene", c("EDD", "ESD", "FS", "HP", "DI" , "SI" ,"Relaxtime","Peaked_median", "Minimum.velocity", "Maximum.velocity","Mad_HP","Mad_DI", "Mad_SI","AI", "tt10r_MEDIAN", "MAD.tt10r", "CO", "SV", "retrograde_speed", "anterograde_percent", "anterograde_speed"))
g2 <- melt(g, "fly_gene", c("sens_EDD", "sens_ESD", "sens_FS", "sens_HP" , "sens_DI", "sens_SI", "sens_Relaxtime", "sens_Peaked_median", "sens_minv","sens_maxv","sens_MAD_HP", "sens_MAD_DI", "sens_MAD_SI" ,"sens_AI","sens_tt10r", "sens_MAD_tt10r","sens_CO", "sens_SV", "sens_retrograde_speed", "sens_anterograde_percent", "sens_anterograde_speed" ))
g1 <- as.data.frame(unclass(g1))
g2 <- as.data.frame(unclass(g2))
# Convert via as.character() so this works whether value is a factor or a character column
g1$value <- as.numeric(as.character(g1$value))
g2$value <- as.numeric(as.character(g2$value))
# Natural log: -log(0.05) = 2.995732, -log(0.01) = 4.60517, -log(0.005) = 5.298317, -log(0.0001) = 9.21034
g1$value <- -log(g1$value)
g1$Significance_Level <- cut(g1$value, breaks = c(-0.09531018, 0, 2.995732, 4.60517, 5.298317, 9.21034, Inf), labels = c("NS", "NS", "", "", "", "****"))
g2$sens <- cut(g2$value, breaks = c(0, 1, 10, Inf), labels = c("Reduced_phenotypes", "Enhanced_phenotypes", "Not_significant"), right = FALSE)
g1$phenotype_sens <- g2$variable
g1$sens_value <- g2$value
g1$sens <- g2$sens
plot1 <- ggplot(g1, aes(variable, fly_gene)) +
  geom_tile(colour = "black", fill = "white", size = 0.1) +
  geom_point(aes(size = Significance_Level, colour = sens), shape = 16) +
  scale_colour_manual(values = c("#649B88", "#960018", "NA")) +
  coord_fixed() +
  labs(y = "", x = "") +
  theme_grey(base_size = 7) +
  theme(
    axis.ticks = element_line(),
    panel.grid.major = element_blank(),
    panel.grid.minor = element_blank(),
    plot.background = element_blank(),
    panel.border = element_blank(),
    plot.title = element_text(color = "black", size = 13, hjust = 0),
    axis.text.x = element_text(angle = 90, size = 11, hjust = 0.5, vjust = 1),
    axis.text.y = element_text(size = 12)
  )
plot1
ggsave(plot1,filename="heatmap.png",height=5.5,width=8.8,units="in",dpi=200)
This is what my data looks like
My problem is that this works well when I compare only a few genes: the size of the points and the size of the tiles match. However, if my data set contains many genes (40 in this case), the points look far too big compared to the tiles. Is there a way to match the point size to the tile size?
I've attached an example of the plot I like and one where the points are too big.
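In case it helps to reproduce this without my spreadsheet, here is a minimal sketch on made-up data. The gene names and significance labels are hypothetical (randomly generated), the 0.8 share of the figure height given to the panel is a guess, and treating ggplot2 point sizes as roughly millimetres is an approximation rather than an exact rule. The idea is only to estimate the tile height for the saved figure and cap the largest point at a fraction of it:

```r
library(ggplot2)

# Hypothetical toy data: 40 made-up genes x 5 phenotype columns
set.seed(1)
toy <- expand.grid(
  fly_gene = paste0("gene", 1:40),
  variable = c("EDD", "ESD", "FS", "HP", "DI"),
  stringsAsFactors = FALSE
)
toy$Significance_Level <- factor(
  sample(c("NS", "****"), nrow(toy), replace = TRUE),
  levels = c("NS", "****")
)

# Rough tile height in mm for a figure saved at this height.
# Assumption: about 80% of the figure height is left for the panel.
fig_height_in <- 5.5
n_genes <- length(unique(toy$fly_gene))
tile_mm <- fig_height_in * 25.4 * 0.8 / n_genes

# Cap the largest point at a fraction of the estimated tile height
max_size <- 0.8 * tile_mm

p <- ggplot(toy, aes(variable, fly_gene)) +
  geom_tile(colour = "black", fill = "white") +
  geom_point(aes(size = Significance_Level), shape = 16) +
  scale_size_manual(values = c("NS" = max_size / 3, "****" = max_size)) +
  coord_fixed()
```

With this, the sizes shrink as n_genes grows. The other direction would presumably be to keep the sizes and increase the height passed to ggsave() in proportion to the number of genes, but I don't know which is the recommended approach.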
I'd appreciate any thoughts. I am a beginner so I'm sorry if my query is too basic.