
It's probably a very easy question.

I can't find the method behind the p-value calculation in R's `cor.test()` function.

DataAdventurer
  • The [reference](https://www.jstor.org/stable/2347111?seq=1#page_scan_tab_contents) is provided in `help("cor.test")`. You can also study the source code of `stats:::cor.test.default`. – Roland Jul 31 '17 at 09:16

1 Answer


Here is the code that calculates the p-value of Pearson's correlation:

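x <- c(44.4, 45.9, 41.9, 53.3, 44.7, 44.1, 50.7, 45.2, 60.1)
y <- c( 2.6,  3.1,  2.5,  5.0,  3.6,  4.0,  5.2,  2.8,  3.8)
ct <- cor.test(x, y, method = "pearson")
ct$p.value  ## this is what cor.test() gives

n <- length(x)
r <- cor(x, y)                     ## sample Pearson correlation
df <- n - 2                        ## degrees of freedom
t <- sqrt(df) * r / sqrt(1 - r^2)  ## t-statistic under H0: true correlation is 0
pval <- 2 * min(pt(t, df), pt(t, df, lower.tail = FALSE))  ## two-sided p-value, calculated manually

all.equal(ct$p.value, pval)  ## tolerant comparison; exact == on floating-point values can be fragile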
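
For reference, the calculation above can be written as a formula. This is a sketch that assumes the default two-sided alternative and the Pearson method; the symbols $T$ and $t_{n-2}$ are notation introduced here, not names used by `cor.test()`. Under the null hypothesis of zero correlation (and assuming bivariate normality), the statistic follows a Student's t distribution with $n - 2$ degrees of freedom:

$$
t = \frac{r\sqrt{n-2}}{\sqrt{1-r^{2}}}, \qquad T \sim t_{n-2}, \qquad
p = 2\,\min\bigl\{P(T \le t),\; P(T \ge t)\bigr\} = 2\,P\bigl(T \ge |t|\bigr)
$$

For the data above, $n = 9$ and $r \approx 0.5712$, so $t \approx \sqrt{7}\cdot 0.5712 / \sqrt{1 - 0.5712^{2}} \approx 1.841$ on 7 degrees of freedom, which gives a p-value of roughly 0.108.
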
AK88
  • Hi AK88, thank you for your answer. What I'm looking for is the formula behind $p.value. – DataAdventurer Jul 31 '17 at 10:15
  • The formula behind `p.value` works like this: first compute the t-statistic as `sqrt(df) * r/sqrt(1 - r^2)`. Then get the lower-tail and upper-tail probabilities of this t-statistic from the t distribution with `df` degrees of freedom. Take the smaller of the two probabilities and multiply it by 2, because the test is two-sided and both tails of the distribution count. Is this clear? – AK88 Jul 31 '17 at 10:24
  • Can you provide a source for the formula? I can't find it on the internet. – DataAdventurer Jul 31 '17 at 13:06
  • Sure, have a look at this topic: https://stats.stackexchange.com/questions/120199/calculate-p-value-for-the-correlation-coefficient – AK88 Jul 31 '17 at 13:11