I've got 400 store departments and I'm running (Pearson) correlations between all the departments. How can I output the 'N' (number of cases) and the significance level (p value)?

I'm using the cor function. Here is my current code which works fine:

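numprod <- ncol(data) - 2
# Select columns 2..numprod (avoid naming the object "matrix", which shadows the function)
dept_matrix <- as.matrix(data[, 2:numprod])
# Prepend a label column to the Pearson correlation matrix
AllChannels <- cbind(matrix("All channels", nrow = numprod - 1), cor(dept_matrix, use = "all.obs", method = "pearson"))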

In SPSS, when you run a correlation it outputs the correlation coefficient, N and significance. This is my desired result.

Thanks all!

Lucas


1 Answer

If N is just the length of one of the vectors, use length. If you want the inferential statistics for testing whether the correlation coefficient equals 0, use cor.test (as the help page ?cor tells you). If you need the degrees of freedom for the test, look more closely at ?cor.test.

> cor.test(1:10,2:11)

    Pearson's product-moment correlation

data:  1:10 and 2:11 
t = 134217728, df = 8, p-value < 2.2e-16
alternative hypothesis: true correlation is not equal to 0 
95 percent confidence interval:
 1 1 
sample estimates:
cor 
  1 
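
To pull N and the p value out of a single cor.test result, something along these lines should work. This is only a sketch on made-up vectors (a and b are not the asker's data); it relies on the Pearson test reporting df = n - 2 in the parameter element, so N is recovered as df + 2.

```r
# Toy vectors for illustration only (not the asker's data)
a <- c(1, 2, 3, 4, 5, 6)
b <- c(2, 1, 4, 3, 6, 5)

ct <- cor.test(a, b, method = "pearson")

r <- unname(ct$estimate)       # correlation coefficient
p <- ct$p.value                # two-sided p value
n <- unname(ct$parameter) + 2  # Pearson test has df = n - 2

print(c(r = r, n = n, p = p))
```

Note that cor.test drops incomplete pairs before testing, so the N recovered this way is the number of complete cases for that pair.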

cor.test returns a list for a single pair of vectors, so cbind is not a useful way to collect its results. The Hmisc package has rcorr, which handles every pair of columns at once:

install.packages("Hmisc")
library(Hmisc)
x <- c(-2, -1, 0, 1, 2)
y <- c(4,   1, 0, 1, 4)
z <- c(1,   2, 3, 4, NA)
v <- c(1,   2, 3, 4, 5)
rcorr(cbind(x,y,z,v))
# Returns a list with three elements: r (correlations), n (pairwise counts), P (p values)
  x     y     z v
x 1  0.00  1.00 1
y 0  1.00 -0.75 0
z 1 -0.75  1.00 1
v 1  0.00  1.00 1

n
  x y z v
x 5 5 4 5
y 5 5 4 5
z 4 4 5 4
v 5 5 4 5

P
  x      y      z      v     
x        1.0000 0.0000 0.0000
y 1.0000        0.2546 1.0000
z 0.0000 0.2546        0.0000
v 0.0000 1.0000 0.0000       
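
Since each of the three elements is an ordinary matrix, they can be reshaped into one row per pair (coefficient, N and p value side by side, as in SPSS). A rough sketch follows; flatten_rcorr is a hypothetical helper name, not part of Hmisc, and it assumes the list elements are named r, n and P as rcorr returns them. The Hmisc call is guarded so the block still runs if the package is not installed.

```r
# Hypothetical helper: turn an rcorr-style list (matrices r, n, P)
# into a data frame with one row per unordered pair of variables.
flatten_rcorr <- function(res) {
  # Row/column positions of the upper triangle (each pair once, no diagonal)
  idx <- which(upper.tri(res$r), arr.ind = TRUE)
  data.frame(
    var1 = rownames(res$r)[idx[, "row"]],
    var2 = colnames(res$r)[idx[, "col"]],
    r    = res$r[idx],
    n    = res$n[idx],
    p    = res$P[idx],
    stringsAsFactors = FALSE
  )
}

if (requireNamespace("Hmisc", quietly = TRUE)) {
  x <- c(-2, -1, 0, 1, 2)
  y <- c(4,   1, 0, 1, 4)
  z <- c(1,   2, 3, 4, NA)
  v <- c(1,   2, 3, 4, 5)
  res <- Hmisc::rcorr(cbind(x, y, z, v), type = "pearson")
  pairs <- flatten_rcorr(res)
  print(pairs)
  # To export: write.csv(pairs, "correlations.csv", row.names = FALSE)
}
```

The long format is usually easier to stack across subsets (for example with rbind after adding a column naming the subset) than the three square matrices are.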
IRTFM
  • Hey BondedDust. I'm a little confused about your response. What I'm after is an N and a p value for each pair of departments. See http://statistics-help-for-students.com/How_do_I_interpret_data_in_SPSS_for_Pearsons_r_and_scatterplots.htm#.U2nXjfQW3wk for an example of the 3 stats I need. Is that possible? – Lucas May 07 '14 at 06:52
  • Thanks Bonded. This looks good. What's the deal with lists? Can I turn these into a matrix? Can I combine sets of the results for different subsets and export to a csv? – Lucas May 07 '14 at 07:14
  • Each of those list elements _is_ a matrix. I fear we are descending into very basic R concepts for which you need to do some further study on your own with, say, "Introduction to R" that ships with every installation of R. On my machine it's available from the help menu. – IRTFM May 07 '14 at 07:35
  • Yea I don't use R normally hence the lack of understanding. I've got to run 120 or so different correlations for different stores and customer types so I thought I'd try to hash it out in R and try to automate it. SPSS doesn't allow you to do this kinda thing. Appreciate the help. I'll go think on it a while. – Lucas May 07 '14 at 07:41
  • As always, a full description of the problem with a sample dataset of sufficient complexity is the best strategy for asking questions. – IRTFM May 07 '14 at 07:47