I have a chi-square value of 23.426, df = 3, alpha = 0.05 (alpha is the significance level). How can I calculate the p-value of this in R?
1 Answer
There are loads of probability distribution functions in R, see https://www.stat.umn.edu/geyer/old/5101/rlook.html
For yours you want the pchisq() function, where your p value is given by
1-pchisq(23.426, 3)
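As a minimal sketch of the same calculation (the variable names chisqValue, degreesFreedom, alpha, pValue and critValue are my own, for illustration only), pchisq() can also return the upper tail directly with lower.tail = FALSE, which avoids subtracting from 1 and is more accurate for very small p values. Comparing against alpha is then a separate step:

```r
chisqValue <- 23.426
degreesFreedom <- 3
alpha <- 0.05

# Upper-tail probability: P(X >= chisqValue) under the null
pValue <- pchisq(chisqValue, df = degreesFreedom, lower.tail = FALSE)
pValue

# alpha plays no part in computing p; it only enters the decision
pValue < alpha

# Equivalent decision via the critical value of the chi-square distribution
critValue <- qchisq(1 - alpha, df = degreesFreedom)
chisqValue > critValue
```

Both comparisons lead to the same decision: the null is rejected when p is below alpha, which is the same as the statistic exceeding the critical value.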
On p values and alpha
A p value is the probability of getting a result at least as extreme as yours if you assume the null hypothesis to be true. In other words, how likely is it that the sample differs from the null by this much through sampling error alone? Consider the following example: xBase is a population that does not differ from 0 (it is a standard normal distribution, so the population mean is 0). We have a null hypothesis that the mean value in the population is zero and want to test that null, and (in this case) we know the null is true; normally we don't know that. We can sample 50 individuals to get a (non-zero) sample mean, and perform a one-sided t test to tell us how probable a sample mean at least that large is under the null, given our sample size. Do that 20,000 times over and, using sum(pOut), we can see that 1004 samples returned a p value less than 0.05, a false positive rate of 0.0502 (which is what mean(pOut) returns).
set.seed(1)
# Create a base population
xBase <- rnorm(100000,0,1)
# Repeated sampling of base population
pOut <- logical(20000)
for(i in 1:20000){
  # Sample that population
  xSample <- sample(xBase, 50)
  # Perform one-sided t test (storing whether p < 0.05)
  pOut[i] <- 1 - pt(
    (mean(xSample) - 0)/(sd(xSample)/sqrt(50)),
    50 - 1) < 0.05
}
# False positive rate
mean(pOut)
Alpha simply states what value of p you regard as "statistically significant", often 0.05. This means that if we calculate p and it is greater than alpha, we cannot reject the null hypothesis. Alpha is just the rate of false positives that we accept, so in the above example it was 0.05. If you decide that alpha is 0.01, then you will only reject the null if p is less than 0.01. Repeating the simulation with alpha = 0.01, there were 186 false positives (0.0093).
set.seed(1)
# Create a base population
xBase <- rnorm(100000,0,1)
# Repeated sampling of base population
pOut <- logical(20000)
for(i in 1:20000){
  # Sample that population
  xSample <- sample(xBase, 50)
  # Perform one-sided t test (storing whether p < 0.01)
  pOut[i] <- 1 - pt(
    (mean(xSample) - 0)/(sd(xSample)/sqrt(50)),
    50 - 1) < 0.01
}
# False positive rate
mean(pOut)
Alpha defines a cut-off: it does not affect the calculation of p, but it does affect what we conclude from p.
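To make that separation concrete, here is a rough sketch that computes the p values once and only afterwards compares them with two different alphas. It uses the built-in t.test() with alternative = "greater" in place of the hand-written statistic, and nSim and pValues are names I have made up for illustration (fewer repetitions than above, just to keep it quick):

```r
set.seed(1)
# Base population in which the null (mean = 0) is true
xBase <- rnorm(100000, 0, 1)

# nSim is a hypothetical name; smaller than 20,000 to keep the run short
nSim <- 2000

# Store the p values themselves; no alpha is involved at this stage
pValues <- replicate(
  nSim,
  t.test(sample(xBase, 50), mu = 0, alternative = "greater")$p.value
)

# The same p values, judged against two different cut-offs
mean(pValues < 0.05)
mean(pValues < 0.01)
```

The vector pValues is identical in both comparisons; only the proportion counted as "significant" changes with the cut-off.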

okay! thanks a lot. :) but the thing is, I want to calculate the p-value at this particular alpha value, which is 0.05. I know how to calculate the p-value without this alpha. but I want to calculate considering this alpha value. – Shell Apr 29 '20 at 13:55