Statistical tests in R return lists, but when you print the result of a test, R displays the list in a special user-friendly layout rather than as a plain list. To see what I mean, consider an example using the t.test function in the stats package.

#Run a T-test on some example data
X <- c(30, 32, 40, 28, 29, 35, 30, 34, 31, 39);
Y <- c(19, 20, 44, 45, 8, 29, 26, 59, 35, 50);
TEST <- stats::t.test(X,Y);

#Show structure of the TEST object
str(TEST);
List of 9
 $ statistic  : Named num -0.134
  ..- attr(*, "names")= chr "t"
 $ parameter  : Named num 10.2
  ..- attr(*, "names")= chr "df"
 $ p.value    : num 0.896
 $ conf.int   : num [1:2] -12.3 10.9
  ..- attr(*, "conf.level")= num 0.95
 $ estimate   : Named num [1:2] 32.8 33.5
  ..- attr(*, "names")= chr [1:2] "mean of x" "mean of y"
 $ null.value : Named num 0
  ..- attr(*, "names")= chr "difference in means"
 $ alternative: chr "two.sided"
 $ method     : chr "Welch Two Sample t-test"
 $ data.name  : chr "X and Y"
 - attr(*, "class")= chr "htest"

This object is a list with nine elements, some of which are named via attributes. However, when I print the TEST object, the returned information is structured in a different way than the standard printing of a list.

#Print the TEST object
TEST;

        Welch Two Sample t-test

data:  X and Y
t = -0.13444, df = 10.204, p-value = 0.8957
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -12.27046  10.87046
sample estimates:
mean of x mean of y 
     32.8      33.5 

As you can see, this printed output is much more user-friendly than the standard printing for a list. I would like to be able to program statistical tests in R which generate a list of outputs similar to the above, but which print in this user-friendly way.


My Questions: Why does R print the output of the list TEST in this special way? If I create a list of outputs of a statistical test (e.g., like the above), how can I set the object to print in this way?

M--
Ben

2 Answers

Use whichever of the methods below best meets your needs.

X <- c(30, 32, 40, 28, 29, 35, 30, 34, 31, 39)
Y <- c(19, 20, 44, 45, 8, 29, 26, 59, 35, 50)
TEST <- stats::t.test(X,Y)

#default; printing data of htest class
print(TEST) 

#printing every element of the list
lapply(TEST, print) 
print.listof(TEST)

#printing the results as a dataframe
broom::tidy(TEST) #output of this one is included just for illustration


    # A tibble: 1 x 10
  estimate estimate1 estimate2 statistic p.value parameter conf.low conf.high method                  alternative
     <dbl>     <dbl>     <dbl>     <dbl>   <dbl>     <dbl>    <dbl>     <dbl> <chr>                   <chr>      
1     -0.7      32.8      33.5    -0.134   0.896      10.2    -12.3      10.9 Welch Two Sample t-test two.sided 

To address OP's follow-up question:

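Each class of object can have its own printing method. As I outlined in my answer, print is a generic function: it looks at the class of TEST and, because that class is htest, it dispatches to print.htest. A class without its own method falls back to print.default.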

class(TEST)
# [1] "htest"

head(methods(print))
# [1] "print.acf"         "print.AES"         "print.all_vars"    "print.anova"
# [5] "print.anova.lme"   "print.ansi_string"

In my freshly opened R session, I have 185 different print methods. As you load more packages, the number will go higher.

If you want to dig deeper, look at the source code of print, which can be found in the R source code on GitHub: https://github.com/wch/r-source/blob/trunk/src/library/base/R/print.R
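
To make the dispatch idea concrete, here is a minimal sketch using a made-up class. The class name `mytest`, the constructor `new_mytest`, and the output layout are all hypothetical and only serve as illustration; they are not part of R or of any package.

```r
# Hypothetical constructor: a plain list tagged with the made-up class "mytest"
new_mytest <- function(statistic, p.value) {
  structure(list(statistic = statistic, p.value = p.value), class = "mytest")
}

# S3 print method: print() dispatches here for objects of class "mytest"
print.mytest <- function(x, ...) {
  cat("My test: statistic = ", format(x$statistic),
      ", p-value = ", format.pval(x$p.value), "\n", sep = "")
  invisible(x)  # print methods conventionally return their argument invisibly
}

obj <- new_mytest(statistic = 1.23, p.value = 0.2187)

print(obj)           # dispatches to print.mytest
print(unclass(obj))  # class removed, so the default list printing is used
```

The same list is printed in two different ways; the only difference is the class attribute, which is what decides the method that print picks.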

M--
  • 1
    Thanks for your answer. While this is certainly very interesting, it does not explain why `R` prints the T-test the way it does by default, or how this can be emulated on another list. – Ben May 28 '19 at 02:00
  • 1
    @Ben It does. "Each" class of data has a method of printing. As I outlined in my answer, `print` looks at `TEST` and, as it is of class `htest`, it uses [`print.htest`](https://www.rdocumentation.org/packages/EnvStats/versions/2.3.1/topics/print.htest). You can check the class of your data with `class(TEST)` and see the various printing methods with `methods(print)`. If you want to dig deeper, then you need to look at the source code of `print`, which can be found here: https://github.com/wch/r-source/blob/trunk/src/library/base/R/print.R – M-- May 28 '19 at 02:17

This answer pulls together helpful comments and answers from other users, spelled out in more detail for the benefit of readers who are not already familiar with these issues. The object created by the t.test function has class htest, and objects of this class have a dedicated S3 printing method, print.htest, which is defined in the stats package. When you print the object, R dispatches to that method, which extracts information from the list and prints it in the user-friendly layout you see in the output in the question.
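
As a quick check of this, the sketch below looks up the method that print uses for htest objects and compares the printed output with and without the class attribute. The variable names `m`, `out_htest` and `out_plain` are just illustrative.

```r
X <- c(30, 32, 40, 28, 29, 35, 30, 34, 31, 39)
Y <- c(19, 20, 44, 45, 8, 29, 26, 59, 35, 50)
TEST <- stats::t.test(X, Y)

# Look up the S3 method that print() dispatches to for class "htest"
m <- utils::getS3method("print", "htest")
environmentName(environment(m))  # name of the namespace where the method is defined

# Capture the printed output with and without the class attribute
out_htest <- utils::capture.output(print(TEST))
out_plain <- utils::capture.output(print(unclass(TEST)))

any(grepl("Welch Two Sample t-test", out_htest, fixed = TRUE))  # formatted report
any(grepl("$statistic", out_plain, fixed = TRUE))               # plain list printing
```

Typing `stats:::print.htest` at the console shows the body of the method, which is a handy reference for which list elements it reads.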

If you want to replicate this type of printing for a new statistical test that you are programming yourself, then you will need to structure your new test so that it outputs a htest object, with the required elements of the list, and the required class. Here is an example from another answer where a hypothesis test set out in Tarone (1979) is programmed as a htest object:

Tarone.test <- function(N, M) {

    #Check validity of inputs
    if(any(M > N)) { stop("Observed count value exceeds binomial trials"); }

    #Set hypothesis test objects
    method      <- "Tarone's Z test";
    alternative <- "greater";
    null.value  <- 0;
    attr(null.value, "names") <- "dispersion parameter";
    data.name   <- paste0(deparse(substitute(M)), " successes from ", 
                          deparse(substitute(N)), " counts");

    #Calculate test statistics
    estimate    <- sum(M)/sum(N);
    attr(estimate, "names") <- "proportion parameter";

    S           <- sum((M - N*estimate)^2/(estimate*(1 - estimate)));
    statistic   <- (S - sum(N))/sqrt(2*sum(N*(N-1))); 
    attr(statistic, "names") <- "z";

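    #Two-sided p-value from the standard normal distribution
    #(for the one-sided "greater" alternative set above, pnorm(statistic, lower.tail = FALSE) would be the matching choice)
    p.value     <- 2*pnorm(-abs(statistic), 0, 1);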
    attr(p.value, "names") <- NULL;

    #Create htest object
    TEST        <- list(statistic = statistic, p.value = p.value, estimate = estimate, 
                        null.value = null.value, alternative = alternative, 
                        method = method, data.name = data.name);
    class(TEST) <- "htest";

    TEST; }

In this example, the function calculates all the required elements of the htest object and then creates this object as a list with that class. It is important to include the command class(TEST) <- "htest" in the code, so that the object created is not just a regular list. Inclusion of that command will ensure that the output object is of the proper class, and so it will print in a user-friendly way. To see this, we can generate some data and apply the test:

#Generate example data
N <- c(30, 32, 40, 28, 29, 35, 30, 34, 31, 39);
M <- c( 9, 10, 22, 15,  8, 19, 16, 19, 15, 10);

#Apply Tarone's test to the example data
TEST <- Tarone.test(N, M);
TEST;

        Tarone's Z test

data:  M successes from N counts
z = 2.5988, p-value = 0.009355
alternative hypothesis: true dispersion parameter is greater than 0
sample estimates:
proportion parameter 
           0.4359756

Here we see that our newly created hypothesis-testing function gives us output that has a similar user-friendly structure to the t.test. In this example we have given different names to the testing method and the elements of the test, and these appear in the descriptive output when printed.
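
An htest object can also carry optional elements such as a confidence interval, which print.htest displays when present. The following is a minimal sketch under stated assumptions: the function name `z.test.sketch` and the test itself (a one-sample z-test with a known standard deviation `sigma`) are hypothetical and chosen only to keep the example short.

```r
# Hypothetical one-sample z-test with known standard deviation 'sigma'
z.test.sketch <- function(x, mu = 0, sigma, conf.level = 0.95) {
  # Record the name of the data argument before doing anything else
  data.name <- deparse(substitute(x))
  stopifnot(is.numeric(x), length(x) > 0, sigma > 0)

  n  <- length(x)
  se <- sigma / sqrt(n)

  # Named statistic: the name "z" is what appears in the printed output
  statistic <- c(z = (mean(x) - mu) / se)
  p.value   <- 2 * pnorm(-abs(unname(statistic)))  # two-sided p-value

  # print.htest reads the "conf.level" attribute to label the interval
  crit     <- qnorm(1 - (1 - conf.level) / 2)
  conf.int <- mean(x) + c(-1, 1) * crit * se
  attr(conf.int, "conf.level") <- conf.level

  # Assemble the list and give it class "htest" so that print.htest is used
  structure(
    list(statistic   = statistic,
         p.value     = p.value,
         conf.int    = conf.int,
         estimate    = c("mean of x" = mean(x)),
         null.value  = c(mean = mu),
         alternative = "two.sided",
         method      = "One-sample z-test (sketch)",
         data.name   = data.name),
    class = "htest")
}

scores <- c(5.1, 4.9, 5.6, 5.8, 6.0)
res <- z.test.sketch(scores, mu = 5, sigma = 0.5)
res
```

Elements that are left out of the list (here, `parameter`) are simply skipped by the printing method, so you only need to supply the pieces that make sense for your test.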

Ben
  • This is not my downvote but I understand why is that. You are making a so called custom test (`MY.TEST`) which has exact variables (outputs) as `t.test`. Don't you see it? You just made `t.test` again but now, instead of making the output structure automatically, you put them into a `htest` manually. What's the point of that? If you can think of a statistical test that later its result can be used as inputs for your function, would you present it as an example in your answer, so you can avoid further downvotes. Honestly, I twisted my own arm to not click on that down arrow. Cheers. – M-- May 28 '19 at 20:34
  • 1
    It doesn't need to have the same output variables as `t.test`. It could be an entirely different hypothesis test. I have already given an example of a different test at the stated link (given again [here](https://stats.stackexchange.com/questions/409490/how-do-i-carry-out-a-significance-test-with-tarones-z-statistic/410376#410376)). The point is to allow you to create a different kind of hypothesis test, but still frame it as a `htest` object, so that it prints nicely. (Those outputs are the mandatory outputs for a `htest` object. They are not specific to the `t.test`.) – Ben May 28 '19 at 22:54
  • Well I would include some of the info from the link you shared again with me in the question (not as a link) so folks who are hasting towards end of their day will gain some more insight ;) – M-- May 28 '19 at 22:57