Extracting final p-value from output of regression (lm) in R

Question

I have the following data and code:

> res = lm(vnum1~vnum2+vch1, data=rndf)
> sumres=summary(res)
> 
> sumres

Call:
lm(formula = vnum1 ~ vnum2 + vch1, data = rndf)

Residuals:
     Min       1Q   Median       3Q      Max 
-1.48523 -0.42050  0.05919  0.43710  1.93554 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)  
(Intercept)  -1.0265     1.0192  -1.007   0.3310  
vnum2         1.9538     0.9665   2.022   0.0628 .
vch1B        -0.7072     0.8386  -0.843   0.4132  
vch1C         0.5502     0.8546   0.644   0.5301  
vch1D        -0.6556     0.8412  -0.779   0.4488  
vch1E         0.1461     0.8418   0.174   0.8647  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.9181 on 14 degrees of freedom
Multiple R-squared:  0.2799,    Adjusted R-squared:  0.02275 
F-statistic: 1.088 on 5 and 14 DF,  p-value: 0.4088


> dput(rndf)
structure(list(vnum1 = c(-1.63272832611568, 0.225401613406123, 
-0.412759271404808, 0.0518634835165988, 0.130576187815585, 0.393254112514486, 
-0.22429939238377, -1.01640685392138, -0.5419194916071, 0.602275306119663, 
-0.378031662946265, -0.357452340621538, 0.178526276590386, -0.138016672074599, 
2.13719092448509, 1.03443214036885, 1.34821211116271, -0.718873325233001, 
1.80014304090489, -0.497878912730538), vnum2 = c(0.168299512239173, 
0.624244463164359, 0.0156862761359662, 0.450781079474837, 0.622718085534871, 
0.285390306729823, 0.911491815699264, 0.500363457249478, 0.566354847047478, 
0.942464957712218, 0.00690335803665221, 0.860874759964645, 0.786528263241053, 
0.337976476177573, 0.346998119959608, 0.549394505331293, 0.71448978385888, 
0.865091580431908, 0.967393533792347, 0.539990464225411), vch1 = structure(c(3L, 
5L, 5L, 3L, 3L, 3L, 1L, 5L, 4L, 2L, 3L, 4L, 4L, 3L, 3L, 3L, 1L, 
2L, 5L, 2L), .Label = c("A", "B", "C", "D", "E"), class = "factor")), .Names = c("vnum1", 
"vnum2", "vch1"), class = "data.frame", row.names = c(NA, -20L
))

I can get R-squared and Adjusted R-squared values from sumres$r.squared and sumres$adj.r.squared. But I am not able to get the final p-value 0.4088 from res or sumres. How can I get this value? Thanks for your help.

Jthorpe · Accepted Answer · 2015-01-19T02:13:43.007


You can see the code that is used to print the summary by typing

class(sumres)
#> "summary.lm"

to get the class, and then get the code for the print method by typing

stats:::print.summary.lm

into the console. The output includes these lines:

 cat(...lots of stuff..., "p-value:", format.pval(pf(x$fstatistic[1L], 
            x$fstatistic[2L], x$fstatistic[3L], lower.tail = FALSE), 
            digits = digits)...morestuff...)

So in this case, you want:

pf(sumres$fstatistic[1L], sumres$fstatistic[2L], sumres$fstatistic[3L], lower.tail = FALSE)
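To make this reusable, the call above can be wrapped in a small function. This is only a sketch: `overall_p` is a hypothetical name (not part of base R), and the built-in `mtcars` data stands in for your own data frame.

```r
# Hypothetical helper (not part of base R): overall F-test p-value of an lm fit.
overall_p <- function(fit) {
  f <- summary(fit)$fstatistic        # named vector: value, numdf, dendf
  if (is.null(f)) return(NA_real_)    # e.g. an intercept-only model has no F statistic
  # Upper tail of the F distribution, same computation as in print.summary.lm
  unname(pf(f[["value"]], f[["numdf"]], f[["dendf"]], lower.tail = FALSE))
}

fit <- lm(mpg ~ wt + hp, data = mtcars)  # built-in data, for illustration only
p <- overall_p(fit)
print(p)
```

Accessing the elements by name rather than by position also drops the leftover `value` name that the positional version carries over into the result.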

edited Jan 19 '15 at 02:13

answered Jan 19 '15 at 02:07

Jthorpe


Sorry, the p-value is now in the answer – Jthorpe Jan 19 '15 at 02:12

I am getting the correct value for this data, but for my real, large data the value comes out different: 7.374763e-196, while the actual value is < 2.2e-16. Do I need to put the sample size in your equation? – rnso Jan 19 '15 at 02:34
`sumres$fstatistic[2L]` and `sumres$fstatistic[3L]` are the degrees of freedom for the numerator and denominator of the F statistic, so the sample size is already accounted for. The difference is only in the display: in the print method, the number printed to the console is formatted by calling `format.pval(7.374763e-196)` – Jthorpe Jan 19 '15 at 02:38
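The formatting difference described in the comment above can be checked directly. This is a minimal sketch using the value quoted in the comment; the variable names are only for illustration.

```r
p_raw <- 7.374763e-196         # raw value returned by pf(), as quoted in the comment
shown <- format.pval(p_raw)    # character string, as a print method would display it

print(shown)                          # a "< ..." string instead of the raw number
print(p_raw < .Machine$double.eps)    # the raw value lies below machine epsilon
```

`format.pval` reports anything smaller than its `eps` argument (by default `.Machine$double.eps`) as "less than" that threshold, so the raw number from `pf()` and the printed summary describe the same result.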
