
Suppose I run a Bayesian simple linear regression. I would like to visualise the results by plotting multiple regression lines drawn from the posterior distributions of a (intercept) and b (slope). How can I display the results in a heatmap-like style, or alternatively use transparency to cope with the overlapping lines? Here is one simple ggplot approach.

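library(ggplot2)
set.seed(123)

# Simulated posterior draws for the intercept (a) and slope (b)
N = 1000
x = 1:80
a = rnorm(N,10,3)
b = rnorm(N,5,2)

# One regression line per posterior draw
y = vector("list",length=N)
for(i in 1:N) {y[[i]] = a[i]+b[i]*x}

# Long format: f identifies the draw each point belongs to
df = data.frame(x=rep(x,N),y=unlist(y))
df$f = rep(1:N,each=80)

(plt <- ggplot(df, aes(x, y,group=f)) + 
  geom_jitter(alpha=1/30,width=5,col="blue") + theme_classic())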

Are there better ways to do this? It would be nice if the colour changed with the amount of overlap, as it does in a heatmap.

beginneR

2 Answers


Why not draw a line plot using samples from the posterior?

g = ggplot(df, aes(x, y)) + 
  geom_line(alpha=1/50,col="grey",aes(group=f)) + 
  theme_classic() 

You can then add a darker line for the posterior expectation:

# fun replaces the deprecated fun.y argument (ggplot2 >= 3.3.0)
g + stat_summary(geom="line", fun=mean, color="black", lwd=1)

This gives:

[Figure: the posterior sample lines in faint grey with the posterior mean line in black]
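
If the individual lines are still too busy, one variation is to summarise them pointwise and draw a band instead. The following is only a sketch under assumptions: it reuses the simulated draws from the question, and the names lines_mat, band and p_band, as well as the choice of a 95% interval, are mine rather than part of the answer above.

```r
library(ggplot2)
set.seed(123)

# Same simulated "posterior" draws as in the question
N <- 1000
x <- 1:80
a <- rnorm(N, 10, 3)
b <- rnorm(N, 5, 2)

# N x length(x) matrix: row i holds the line a[i] + b[i] * x
lines_mat <- outer(b, x) + a

# Pointwise summaries of the lines at each x (95% interval is an assumption)
band <- data.frame(
  x     = x,
  mean  = colMeans(lines_mat),
  lower = apply(lines_mat, 2, quantile, probs = 0.025),
  upper = apply(lines_mat, 2, quantile, probs = 0.975)
)

# Ribbon for the interval, solid line for the pointwise mean
p_band <- ggplot(band, aes(x, mean)) +
  geom_ribbon(aes(ymin = lower, ymax = upper), fill = "grey70", alpha = 0.5) +
  geom_line(color = "black") +
  labs(y = "y") +
  theme_classic()
p_band
```

The band loses the "spaghetti" feel of the sampled lines, so it is probably best seen as a complement to the line plot rather than a replacement.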

csgillespie

Another option is the stat_density_2d function in ggplot2, which can be used in several ways. Using your df:

As a heatmap

# after_stat(density) and fun replace the deprecated ..density.. and fun.y
ggplot(df, aes(x = x, y=y))+
  stat_density_2d(aes(fill = after_stat(density)), geom = "raster", contour = FALSE)+
  scale_fill_gradient(low = "blue", high = "red")+
  stat_summary(geom="line", fun=mean, color = "white",lwd=1)+
  theme_classic()

[Figure: heatmap]

Alternatively, you could use points:

# Point size encodes the estimated density at each grid location
ggplot(df, aes(x = x, y=y))+
  stat_density_2d(aes(size = after_stat(density)), geom = "point", contour = FALSE)+
  stat_summary(geom="line", fun=mean, color = "white",lwd=1)+
  theme_classic()

[Figure: point density]
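
A related variation, if you want the colour to reflect raw overlap rather than a smoothed density estimate, is to count points per cell with geom_bin_2d. This is a rough sketch, not part of the answer above: it rebuilds the question's df, and the name p_bins and the bin widths (1 in x, 20 in y) are assumptions that you would tune to your own data.

```r
library(ggplot2)
set.seed(123)

# Rebuild the question's long-format data frame
N <- 1000
x <- 1:80
a <- rnorm(N, 10, 3)
b <- rnorm(N, 5, 2)
df <- data.frame(
  x = rep(x, times = N),
  y = rep(a, each = length(x)) + rep(b, each = length(x)) * rep(x, times = N),
  f = rep(seq_len(N), each = length(x))
)

# Each line contributes one point per x value, so with an x bin width of 1
# the fill roughly counts how many lines pass through each cell
p_bins <- ggplot(df, aes(x, y)) +
  geom_bin_2d(binwidth = c(1, 20)) +
  scale_fill_gradient(low = "blue", high = "red", name = "lines per cell") +
  theme_classic()
p_bins
```

Compared with stat_density_2d, the counts are not smoothed, so the picture is blockier but the legend reads directly as a number of lines.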

mfidino