1

With the data I have, this R code x <- t.test(Age ~ Completers, var.equal = TRUE, data = data) renders the following result:

    Two Sample t-test

data:  Age by Completers
t = 0.93312, df = 1060, p-value = 0.351
alternative hypothesis: true difference in means between group Completers and group Non Completers is not equal to 0
95 percent confidence interval:
 -0.5844018  1.6442118
sample estimates:
    mean in group Completers mean in group Non Completers
                    37.16052                     36.63062

What I would like is to plot each mean (found in x$estimate[1] and x$estimate[2]) with its own point on the x axis at its proper height on the y axis (on the same graph) and each point complemented with the same confidence interval (CI) (found in x$conf.int[1] and x$conf.int[2]). Like this[*]:

enter image description here

Unfortunately, if I'm not mistaken, plot() (from the Generic X-Y Plotting) does not seem to handle this. So I tried with plotCI (from gplots) as follows:

library(gplots)

plotCI(x = x$estimate[1], y = x$estimate[2], 
       li = x$conf.int[1], ui = x$conf.int[2])

But it renders as shown below:

enter image description here

My questions:

  • Is there a way to obtain a plot such as in the first graph with Base R code?
  • If not, what would be the solution (short of using the jmv:: code (see [*]))?

EDIT

As requested in the comments, please find hereunder some code that help reproduce the data (T-Test results are won't be exactly the same as above, but the idea is the same):

# Generate random numbers with specific mean and standard deviation
completers <- data.frame(Completers = 1, 
                         Age = rnorm(100, mean = 37.16052, sd = 8.34224))
nonCompleters <- data.frame(Completers = 0, 
                            Age = rnorm(100, mean = 36.63062, sd = 11.12173))

# Convert decimaled number to integers
completers[] <- lapply(completers, as.integer)
nonCompleters[] <- lapply(nonCompleters, as.integer)

# Stack data from 2 different data frames
df <- rbind(completers, nonCompleters)

# Remove useless data frames
rm(completers, nonCompleters)

# Age ~ Completers (T-Test)
(tTest <- t.test(df$Age ~ df$Completers, var.equal = TRUE))

Sources:


[*] Graph obtained with Jamovi Version 2.3.15.0 which uses the following code (but I would like to avoid using jmv::):

jmv::ttestIS(
  formula = Age ~ Completers,
  data = data,
  plots = TRUE
)

System used:

  • R 4.2.1
  • RStudio 2022.07.1 Build 554
  • macOS Monterey Version 12.5.1 (Intel)
pdeli
  • 436
  • 3
  • 13

1 Answers1

1

There appears to be a misalignment of what you want and what t.test() is giving you. t.test() let you know if there is a difference in means, and report the CI of the difference in sample means (not the CIs of the individual means).

Since you stated you want the CIs of the individual means using base R, you can accomplish this by:

Sample data

nn <- 100
df <- data.frame(Completers = rep(c(1,0), each = nn),
                 Age = c(as.integer(rnorm(nn, mean = 37.16052, sd = 8.34224)),
                         as.integer(rnorm(nn, mean = 36.63062, sd = 11.12173))))

With the raw data, calculate the summary statistics and confidence interval:

# Base R - find summary statistics and restructure into data frame
df_summary <- aggregate(Age ~ Completers, df, function(x) c(mean = mean(x), 
                                                            sd = sd(x), 
                                                            median = median(x), 
                                                            n = length(x)))
df_summary <- data.frame(Completers = df_summary[, 1], df_summary$Age) #reformat nested matrix

# Calculate 95% CI
alpha <- 0.05/2

# Lower CI
df_summary$ci_low <-
  df_summary$mean - qt(1 - alpha, df = df_summary$n) * df_summary$sd /
  sqrt(df_summary$n)

# Upper CI
df_summary$ci_hi <-
  df_summary$mean + qt(1 - alpha, df = df_summary$n) * df_summary$sd /
  sqrt(df_summary$n)

# Output

#  Completers  mean        sd median   n   ci_low    ci_hi
#1          0 34.94 10.730698     34 100 32.81106 37.06894
#2          1 37.43  7.645234     37 100 35.91321 38.94679


Now you can plot the mean and CI for each group (your example also mentioned you wanted the median in there):

# Set Y limits (change to whatever)
ylimits <- c(min(df_summary$ci_low) - 1,
             max(df_summary$ci_hi) + 1)

# Plot
plot(NA, xlim = c(0,3), ylim = ylimits, # blank plot
     axes = FALSE, xlab = "", ylab = "")
segments(x0 = c(1,2), y0 = df_summary$ci_low, y1 = df_summary$ci_hi) # add segments
points(df_summary$mean, pch = 19) # add means
points(df_summary$median, pch = 0)
axis(1, at = 0:3, labels = c(NA, "Completers", "Noncompleters", NA)) # add x axis
axis(2) #add y axis
mtext(side = 1, "Completers", padj = 4) # add x label 
mtext(side = 2, "Age", padj = -4) # add y label
legend("topleft", c("Mean", "Median", "95% CI"),
       pch = c(19, 0, NA), lty = c(NA, NA, 1), bty = "n")

Output: enter image description here

jpsmith
  • 11,023
  • 5
  • 15
  • 36
  • Thank you very much for your answer. Unfortunately, I fail to generalise your solution to the data I have to extract from the ```t.test()```. Now that I added some `R` code that allow reproduction of the results, perhaps you could adapt your answer? (I realise that I should have put reproducible data immediately. Sorry for that.) – pdeli Sep 05 '22 at 16:55
  • I'm a bit confused why you want to use `tTest$conf.int` for the CI for both, this is the CI for the difference between mean_0 and mean_1 (sample difference in means, or ` tTest$estimate[1] - tTest$estimate[2]`), not the CI for the individual means. Are you wanting the CI for the individual means? – jpsmith Sep 06 '22 at 01:42
  • Thank you for your comment. Yes, what I want are the CI for the individual means (if possible). In effect, what I would like is the equivalent of the plot provided by the ```jmv::ttestIS()``` code, but without using that specific code, rather, the base ```R``` code, if possible. If not I'll go with additional packages too. – pdeli Sep 06 '22 at 09:18
  • Please see edits to get the CI for individual means – jpsmith Sep 06 '22 at 12:36
  • Thank you very much. It works like a charm. I have to ask the question though: would there have been an easier/shorter way of having the same result with some additional packages? – pdeli Sep 07 '22 at 15:41