I want to make a scatterplot matrix with points in the upper panel and r or r2 values in the lower panel, as described here: http://www.sthda.com/english/wiki/scatter-plot-matrices-r-base-graphs
When there is no missing data, it works fine. But when there are some missing values, it seems unable to calculate R, even when I use code I thought would account for missing values. See the commented-out lines in the code below, which show what I've tried. Those attempts were based on what I found after searching here on StackOverflow: Dealing with missing values for correlations calculation
Probably something simple, as I'm a pretty simple R user (so I'm hoping for solutions that are more simple than elegant). Talk to me like I'm stupid!
I do not want to remove whole rows just because there is one missing value, as my real dataset (not this example) is rather small.
# --------------------------------------
# Create Dataframes, one with missing values
# --------------------------------------
Alx <- c(13, 9, 5, 17, 2, 8, 11, 4)
Bex <- c(23, 41, 32, 58, 26, 33, 51, 46)
Dex <- c(7,10,6,4,19,6,15,16)
Gax <- c(43,54,31,28,60,30,43,21)
AlxM <- c(NA, 9, 5, 17, 2, 8, 11, 4)
BexM <- c(23, 41, NA, 58, 26, 33, 51, 46)
DexM <- c(7,10,6,4,19,6,15,NA)
GaxM <- c(43,54,31,28,60,30,43,21)
df <- data.frame(Alx,Bex,Dex,Gax) # dataframe that works in scatterplot matrix
df_miss <- data.frame(AlxM,BexM,DexM,GaxM) # dataframe that has missing values
rm(Alx,Bex,Dex,Gax,AlxM,BexM,DexM,GaxM) # removing un-needed garbage
# --------------------------------------
# --------------------------------------
# Scatterplot Matrix - functions for upper and lower
# panels, it is the line "r <- round(cor(x,y), digits=2)"
# that I've been focusing on. Perhaps the wrong approach?
# see: http://www.sthda.com/english/wiki/scatter-plot-matrices-r-base-graphs
# --------------------------------------
# Upper panel
upper.panel <- function(x, y){
  points(x, y, pch=19)
  r <- round(cor(x, y), digits=2)
  txt <- paste0("R = ", r)
  usr <- par("usr"); on.exit(par(usr))
  par(usr = c(0, 1, 0, 1))
  text(0.5, 0.9, txt)
}
# Correlation panel
panel.cor <- function(x, y){
  usr <- par("usr"); on.exit(par(usr))
  par(usr = c(0, 1, 0, 1))
  r <- round(cor(x, y), digits=2) # gives all NA
  # Neither of these (immediately below) worked for me:
  # see: https://stackoverflow.com/questions/7445639/dealing-with-missing-values-for-correlations-calculation
  # r <- round(cor(na.omit(x, y)), digits=2) # does not work
  # r <- round(cor(x, y), use="pairwise.complete.obs", digits=2) # does not work
  txt <- paste0("R = ", r)
  cex.cor <- 0.8/strwidth(txt)
  text(0.5, 0.5, txt, cex = 0.5)
}
# Scatterplots
pairs(df[,1:4], lower.panel = panel.cor,
      upper.panel = upper.panel)
pairs(df_miss[,1:4], lower.panel = panel.cor,
      upper.panel = upper.panel)
# --------------------------------------
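
For comparison, here is a minimal sketch of one way the panels might be written so that missing values are handled pair by pair. It rests on two observations about the attempts above: in the second commented-out line, use= is handed to round() rather than to cor(), and in the first, na.omit() only cleans its first argument. The helper name cor_label is hypothetical (it is not from the tutorial), and cex = 0.8 is just an illustrative text size.

```r
# Same example data with missing values as above
df_miss <- data.frame(
  AlxM = c(NA, 9, 5, 17, 2, 8, 11, 4),
  BexM = c(23, 41, NA, 58, 26, 33, 51, 46),
  DexM = c(7, 10, 6, 4, 19, 6, 15, NA),
  GaxM = c(43, 54, 31, 28, 60, 30, 43, 21)
)

# Hypothetical helper: builds the "R = ..." label for one pair of columns
cor_label <- function(x, y) {
  # use= is an argument of cor(), so it goes inside the cor() call;
  # only rows where x or y is missing are dropped, for this pair only
  r <- cor(x, y, use = "pairwise.complete.obs")
  paste0("R = ", round(r, digits = 2))
}

# Upper panel: points plus the correlation label
upper.panel <- function(x, y) {
  points(x, y, pch = 19)
  usr <- par("usr"); on.exit(par(usr = usr))
  par(usr = c(0, 1, 0, 1))
  text(0.5, 0.9, cor_label(x, y))
}

# Lower panel: correlation label only
panel.cor <- function(x, y) {
  usr <- par("usr"); on.exit(par(usr = usr))
  par(usr = c(0, 1, 0, 1))
  text(0.5, 0.5, cor_label(x, y), cex = 0.8)
}

pairs(df_miss[, 1:4], lower.panel = panel.cor, upper.panel = upper.panel)
```

With this approach no whole rows are removed from the data frame; each panel uses every row that is complete for its own two variables, so different panels may be based on different numbers of points.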