I wanted to create a custom function that calculates the confidence interval of a column and returns it as two columns called lower.bound and upper.bound. I also wanted the function to work inside dplyr::summarize().

The function works as expected in every case I tested, except when the column is named "x". In that case it raises a warning and returns NaN values. It only works when the column is passed explicitly as .$x. Here is an example of the code. I don't understand the nuance; could you point me in the right direction to understand this?

library(dplyr)

set.seed(12)

# creates random data frame
z <- data.frame(
        x = runif(100),
        y = runif(100),
        z = runif(100)
)

# creates function to calculate confidence intervals
conf.int <- function(x, alpha = 0.05) {
        
        sample.mean <- mean(x)
        sample.n <- length(x)
        sample.sd <- sd(x)
        sample.se <- sample.sd / sqrt(sample.n)
        t.score <- qt(p = alpha / 2, 
                   df = sample.n - 1, 
                   lower.tail = FALSE)
        margin.error <- t.score * sample.se
        lower.bound <- sample.mean - margin.error
        upper.bound <- sample.mean + margin.error
        
        data.frame(lower.bound, upper.bound)
        
}

# This works as expected
z %>% 
        summarise(x = mean(y), conf.int(y))

# This does not
z %>% 
        summarise(x = mean(x), conf.int(x))

# This does 
z %>% 
        summarise(x = mean(x), conf.int(.$x))

Thanks!

1 Answer

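This is a "feature" of dplyr: summarise() evaluates its arguments in order, and each later expression sees the columns created by the earlier ones. After x = mean(x), the name x refers to the single mean value rather than the original column, so that single value is what gets passed to conf.int(). With only one value, sd() returns NA and qt() is called with 0 degrees of freedom, which produces the warning and the NaN results.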
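The effect is easy to see on a tiny data frame. This is only a minimal sketch: the data frame `df` and the column name `n_seen` are made up for illustration and are not part of the question.

```r
library(dplyr)

# Hypothetical four-row data frame, used only for this illustration
df <- data.frame(x = c(1, 2, 3, 4))

# x is overwritten first, so length(x) sees the single mean value
clash <- df %>% summarise(x = mean(x), n_seen = length(x))
clash
#     x n_seen
# 1 2.5      1

# With the order reversed, length(x) still sees the original column
no_clash <- df %>% summarise(n_seen = length(x), x = mean(x))
no_clash
#   n_seen   x
# 1      4 2.5
```

In the first call `length(x)` reports 1, which is the same situation conf.int(x) ends up in after `x = mean(x)`.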

Possible options are -

  1. Change the name of the variable to store the mean value
library(dplyr)

z %>% summarise(x1 = mean(x), conf.int(x))

#         x1 lower.bound upper.bound
#1 0.4797154   0.4248486   0.5345822
  2. Change the order
z %>% summarise(conf.int(x), x = mean(x))

#  lower.bound upper.bound         x
#1   0.4248486   0.5345822 0.4797154
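Another way to sidestep the clash, offered here as an assumption rather than one of the options listed above, is to let the helper return the mean alongside the bounds, so that no separate `x = mean(x)` is needed. The function name `conf.int.with.mean` and the output column `mean` are hypothetical; the sketch relies on summarise() accepting an unnamed data frame result, as the original conf.int() call already does.

```r
library(dplyr)

# Hypothetical variant of conf.int() that also returns the sample mean
conf.int.with.mean <- function(x, alpha = 0.05) {
  n <- length(x)
  m <- mean(x)
  # Two-sided t-based margin of error
  margin.error <- qt(1 - alpha / 2, df = n - 1) * sd(x) / sqrt(n)
  data.frame(mean = m,
             lower.bound = m - margin.error,
             upper.bound = m + margin.error)
}

set.seed(12)
z <- data.frame(x = runif(100), y = runif(100))

# One call produces all three columns; x is never overwritten
z %>% summarise(conf.int.with.mean(x))
```

Because the column x is not reassigned inside summarise(), the order of the arguments no longer matters.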
Ronak Shah