Correlation between numeric and boolean variables

Question

I am creating a plot in R, using:

plot(IQ, isAtheist)
abline(lm(isAtheist~IQ))

IQ is numeric and isAtheist is boolean, having values TRUE or FALSE.

I have tried to write:

cor(IQ, isAtheist)

But it gives me an error:

Error in cor(IQ, isAtheist) : 'x' must be numeric

How can I determine the correlation between these two variables?

score 5 · Answer 1 · answered Oct 20 '12 at 22:28

I am not sure how you want to interpret a correlation in this case, but you can try cor(IQ, as.numeric(isAtheist)). With this conversion, TRUE becomes 1 and FALSE becomes 0.
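
A minimal sketch of this conversion follows. The data are simulated purely for illustration; only the names IQ and isAtheist come from the question, and the values are assumptions.

```r
# Simulated data for illustration; only the variable names come from the question
set.seed(1)
IQ <- rnorm(100, mean = 100, sd = 15)
isAtheist <- rep(c(TRUE, FALSE), 50)

# as.numeric() maps TRUE to 1 and FALSE to 0
as.numeric(c(TRUE, FALSE))   # 1 0

# Pearson correlation between IQ and the 0/1 indicator
r <- cor(IQ, as.numeric(isAtheist))
r
```

The sign of r then tells you whether IQ tends to be higher in the TRUE group (positive) or in the FALSE group (negative).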

Quentin Geissmann

IRTFM · Accepted Answer · 2012-10-21T15:11:41.493

This is what I think you may want (to show the differences in mean IQ value superimposed on the boxplot):

plot(IQ~isAtheist)
lines(x=c(1,2), y=predict( lm(IQ~isAtheist), 
                     newdata=list(isAtheist=c("NO","YES") ) ) ,
       col="red", type="b")

By default, plot.formula places the X-positions at as.numeric(factor(isAtheist)), i.e. at 1 and 2 rather than at 0 and 1, which is what your use of abline assumed. It makes no sense to extrapolate beyond those values, so I chose to plot a bounded segment. Here is a worked example:

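set.seed(123)
# IQ is not defined in the original example; simulated values are assumed here
IQ <- rnorm(100, mean=100, sd=15)
isAtheist <- factor(c("NO","YES")[1+rep( c(0,1), 50 )])
plot(IQ~isAtheist)
# join the two predicted group means at x = 1 and x = 2
lines(x=c(1,2), y=predict( lm(IQ~isAtheist),
                     newdata=data.frame(isAtheist=c("NO","YES") ) ) ,
       col="red", type="b")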

Thank you! I will use your approach. – Edward Ruchevits Oct 21 '12 at 13:18
