Below is a sample of my data frame. It has one factor year, four integer variables ep1:ep4 and one weighting variable w.

I use R v.4.0.2 and RStudio v.1.2.5042.

I want to calculate the weighted mean of each variable for each year separately, and display each weighted mean together with its 95% confidence interval in a point plot.

    ep1 <- as.integer(runif(50,1,5))
    ep2 <- as.integer(runif(50,1,5))
    ep3 <- as.integer(runif(50,1,5))
    ep4 <- as.integer(runif(50,1,5))
    
    y1 <- seq(from=1980, to=2020, by=2)
    y2 <- seq(from=1980, to=2020, by=2)
    y3 <- seq(from=1980, to=2020, by=2)
    
    w <- rnorm(50, 1, 1)

    year <- c(y1,y2,y3); year <- year[-51:-64]

    df <- data.frame(year, ep1, ep2,ep3, ep4, w)
    head(df)

The graph should be something like this.

I managed to calculate the weighted mean using:

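    library(dplyr)
    # Naming the list element yields columns ep1_weighted.mean, ..., ep4_weighted.mean
    dat <- df %>%
      group_by(year) %>%
      summarise_at(vars(ep1:ep4), list(weighted.mean = ~ weighted.mean(., w = w, na.rm = TRUE)))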

This works fine, and I can plot the result with the ggplot function from the ggplot2 package. I used the following:

    library(ggplot2)

    plot <- ggplot(dat, aes(year)) +
      geom_point(aes(y = ep1_weighted.mean), color = "blue") +
      labs(title = "My title", subtitle = "My subtitle", x = "X-axis title", y = "Y-axis title")

The plot of the weighted means works, but I cannot figure out how to add the 95% CI of each weighted mean to it.

First, I tried getting the 95% CI (the lower and upper bounds) from the t.test function, but I couldn't figure out how to run t.test on weighted data. Second, I tried using the weights package to calculate a weighted t-test, but I couldn't find the lower and upper bounds in its output.
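
For reference, here is a minimal sketch of the kind of calculation I am after. It is not a verified solution. The helper weighted_ci is a hypothetical name, and it assumes strictly positive weights, at least two observations per year, a linearization (ratio-estimator) standard error, and a t quantile with n - 1 degrees of freedom. With equal weights it reduces to the ordinary t.test interval. The sample data here use positive weights drawn with runif, which is also an assumption and differs from the rnorm weights above.

```r
# Hypothetical helper: weighted mean with an approximate confidence interval.
# Assumptions: positive weights, linearization standard error, t quantile (df = n - 1).
weighted_ci <- function(x, w, conf = 0.95) {
  ok <- !is.na(x) & !is.na(w)          # drop incomplete pairs
  x <- x[ok]
  w <- w[ok]
  n <- length(x)
  m <- sum(w * x) / sum(w)             # weighted mean
  if (n < 2) return(c(mean = m, lower = NA_real_, upper = NA_real_))
  # Standard error; equals sd(x) / sqrt(n) when all weights are equal
  se <- sqrt(n / (n - 1) * sum(w^2 * (x - m)^2)) / sum(w)
  tq <- qt(1 - (1 - conf) / 2, df = n - 1)
  c(mean = m, lower = m - tq * se, upper = m + tq * se)
}

# Illustrative data shaped like the sample above, but with positive weights
set.seed(1)
df <- data.frame(
  year = rep(seq(1980, 2020, by = 2), length.out = 50),
  ep1  = sample(1:4, 50, replace = TRUE),
  w    = runif(50, 0.5, 1.5)
)

# One row per year: year, mean, lower, upper
res <- do.call(rbind, lapply(split(df, df$year), function(d) {
  data.frame(year = d$year[1], t(weighted_ci(d$ep1, d$w)))
}))

# Point plot with error bars, built only if ggplot2 is installed
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(res, aes(x = year, y = mean)) +
    geom_point(color = "blue") +
    geom_errorbar(aes(ymin = lower, ymax = upper), width = 0.5)
}
```

Calling print(p) would display the plot. With only two or three observations per year, as in this sample, the intervals will be very wide, so I am unsure whether this standard error is appropriate for my real data.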

Thanks a lot for helping! (BTW: I am still a beginner in R)
