5

I have been using the broom package's tidy() function in R to print my model summaries.

However, the tidy() function returns p-values without stars, which makes it a bit weird for many people who are used to seeing stars in model summaries.

Does anyone know a way to add stars to the output?

iBug
  • 35,554
  • 7
  • 89
  • 134
bcdeniz
  • 69
  • 1
  • 5

3 Answers3

10

We can use a convenient function stars.pval from gtools to do this

library(gtools)
library(broom)
library(dplyr)
data(mtcars)
mtcars %>%
   lm(mpg ~ wt + qsec, .) %>%
   tidy %>%
   mutate(signif = stars.pval(p.value))
#        term  estimate std.error  statistic      p.value signif
#1 (Intercept) 19.746223 5.2520617   3.759709 7.650466e-04    ***
#2          wt -5.047982 0.4839974 -10.429771 2.518948e-11    ***
#3        qsec  0.929198 0.2650173   3.506179 1.499883e-03     **
akrun
  • 874,273
  • 37
  • 540
  • 662
4

This is not really the purpose of tidy. It is used to make tidy data frames from various objects, not to provide additional metrics about those objects.

You could always write a function to generate stars based on p-values and add a column to the data frame generated using tidy. For example:

make_stars <- function(pval) {
  stars = ""
  if(pval <= 0.001)
    stars = "***"
  if(pval > 0.001 & pval <= 0.01)
    stars = "**"
  if(pval > 0.01 & pval <= 0.05)
    stars = "*"
  if(pval > 0.05 & pval <= 0.1)
     stars = "."
  stars
}

Then something like:

library(broom)
library(dplyr)

mtcars %>% 
  lm(mpg ~ wt + qsec, .) %>% 
  tidy() %>% 
  mutate(signif = sapply(p.value, function(x) make_stars(x)))

         term  estimate std.error  statistic      p.value signif
1 (Intercept) 19.746223 5.2520617   3.759709 7.650466e-04    ***
2          wt -5.047982 0.4839974 -10.429771 2.518948e-11    ***
3        qsec  0.929198 0.2650173   3.506179 1.499883e-03     **
neilfws
  • 32,751
  • 5
  • 50
  • 63
1

As used by the printCoefmat function in R, you can also use the symnum function from the stats package (included in base r):

pv <- c(0.00001, 0.002, 0.02, 0.06, 0.12, 0.99)

stars <- symnum(pv, corr = FALSE, na = FALSE, 
       cutpoints = c(0, 0.001, 0.01, 0.05, 0.1, 1), 
       symbols = c("***", "**", "*", ".", " "))

# fetch the stars only
as.character(stars)
#> [1] "***" "**"  "*"   "."   " "   " "

# fetch the legend description
attr(stars, "legend")
#> [1] "0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1"

Created on 2018-09-10 by the reprex package (v0.2.0).

Or to precisely answer your question, you can use it like so

library(dplyr)

pv <- c(0.00001, 0.002, 0.02, 0.06, 0.12, 0.99)

star_function <- function(x) {
  symnum(x, corr = FALSE, na = FALSE, 
         cutpoints = c(0, 0.001, 0.01, 0.05, 0.1, 1), 
         symbols = c("***", "**", "*", ".", " "))
}
stars <- star_function(pv)

# fetch the stars only
as.character(stars)
#> [1] "***" "**"  "*"   "."   " "   " "

# fetch the legend description
attr(stars, "legend")
#> [1] "0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1"

mtcars %>%
  stats::lm(mpg ~ wt + qsec, .) %>%
  broom::tidy(.) %>% 
  mutate(sign = as.character(star_function(p.value)))
#> # A tibble: 3 x 6
#>   term        estimate std.error statistic  p.value sign 
#>   <chr>          <dbl>     <dbl>     <dbl>    <dbl> <chr>
#> 1 (Intercept)   19.7       5.25       3.76 7.65e- 4 ***  
#> 2 wt            -5.05      0.484    -10.4  2.52e-11 ***  
#> 3 qsec           0.929     0.265      3.51 1.50e- 3 **

Created on 2018-09-10 by the reprex package (v0.2.0).

David
  • 9,216
  • 4
  • 45
  • 78