
Consider the following data frame:

set.seed(123)
dat1 <- data.frame(Loc = rep(c("a","b","c","d","e","f","g","h"),each = 5),
                   ID = rep(c(1:10), each = 2),
                   var1 = rnorm(200),
                   var2 = rnorm(200),
                   var3 = rnorm(200),
                   var4 = rnorm(200),
                   var5 = rnorm(200),
                   var6 = rnorm(200))
dat1$ID <- factor(dat1$ID)

I am using the RVAideMemoire package to perform permutation ANOVAs.

library(RVAideMemoire)
perm <- multtest.gp(dat1[,3:8], dat1[,1], test = "perm")

The output provides access to the mean and SE for each Loc through the list element tab:

a <- perm$tab

I would like to plot the mean for each group (geom_point) +/- the standard error, faceted by variable. What is the simplest way to get `a` into a ggplot-friendly format for this graph, while labelling the plots with the original Loc names from dat1 (the columns are labeled mean.x and SE.n)?

Ryan

1 Answer

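library(dplyr)    # %>%, select, mutate, left_join
library(tidyr)    # gather
library(ggplot2)  # plotting

# Store the row names of perm$tab (the variable names) as a column named ID.
# Note: this is not the same ID as the one in the original data.frame (dat1).
df <- data.frame(ID = row.names(a), a)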

# To have a usable df to plot mean +/- se, you would need to have a data.frame in the format of:
#     ID     Loc           Mean        SE         Min         Max
#   var1       a     mean_value  se_value    mean - se  mean + se
#   ....

# There are various ways to build this. I'm using the older method of splitting the single data.frame into two sub-frames and then merging them into one.

# The first sub-frame takes only the mean values and ignores the SE values.

# Collapse all the columns starting with "Mean" into long format.
df1 <- df %>% gather(Loc, mean, Mean.a:Mean.h) %>% select(ID, Loc, mean)
# Note: pivot_longer is now the suggested approach, but gather is not going to be defunct anytime soon...
df1 %>% head
#     ID    Loc       mean
# 1 var1 Mean.a  0.1782400
# 2 var2 Mean.a  0.1755200
# 3 var3 Mean.a  0.0097919
# 4 var4 Mean.a -0.1796800
# 5 var5 Mean.a  0.3598900
# 6 var6 Mean.a -0.2262200

df1 <- df1 %>% mutate(Loc = sub("^Mean\\.", "", Loc)) # remove the prefix "Mean." (dot escaped so it matches literally)

# The second sub-frame takes only the SE values and ignores the mean values.

df2 <- df %>% gather(Loc, se, SE.a:SE.h) %>% select(ID, Loc, se)
df2 <- df2 %>% mutate(Loc = sub("^SE\\.", "", Loc))
df2 %>% head

#     ID Loc      se
# 1 var1   a 0.22419
# 2 var2   a 0.18539   
# 3 var3   a 0.16239
# 4 var4   a 0.17894
# 5 var5   a 0.17129
# 6 var6   a 0.18997

# Combine the two sub-frames into a usable format for plotting.
plot.dat <- df1 %>% left_join(df2, by = c("ID", "Loc"))

# Generate the min and max variables. I'm using 1.96 x se (an approximate 95% interval) in this case; drop the 1.96 to plot mean +/- 1 SE.
plot.dat <- plot.dat %>% mutate(min = mean - 1.96*se, max = mean + 1.96*se)

plot.dat %>% head

#    ID Loc       mean      se        min       max
# 1 var1   a  0.1782400 0.22419 -0.2611724 0.6176524
# 2 var2   a  0.1755200 0.18539 -0.1878444 0.5388844
# 3 var3   a  0.0097919 0.16239 -0.3084925 0.3280763
# 4 var4   a -0.1796800 0.17894 -0.5304024 0.1710424
# 5 var5   a  0.3598900 0.17129  0.0241616 0.6956184
# 6 var6   a -0.2262200 0.18997 -0.5985612 0.1461212


# Standard plotting commands with facet_wrap
ggplot(plot.dat, aes(Loc, mean)) + 
      geom_point() + 
      geom_errorbar(aes(ymin = min, ymax=max)) + 
      facet_wrap(~ID)
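
The comment above mentions pivot_longer as the suggested replacement for gather. As a rough sketch of that route, the two sub-frames and the join can be collapsed into a single reshape. The small data frame here is a made-up stand-in for perm$tab (two variables, two locations, invented values), and the sketch assumes the column names follow the Mean.<Loc> / SE.<Loc> pattern with no further dots in the Loc part.

```r
library(dplyr)
library(tidyr)

# Hypothetical stand-in for perm$tab: one row per variable and
# paired Mean.<Loc> / SE.<Loc> columns (values are invented).
a <- data.frame(Mean.a = c(0.2, -0.1), Mean.b = c(0.0, 0.3),
                SE.a   = c(0.10, 0.20), SE.b  = c(0.15, 0.05),
                row.names = c("var1", "var2"))

plot.dat <- data.frame(ID = row.names(a), a) %>%
  # ".value" sends the part before the dot (Mean / SE) to its own column;
  # the part after the dot becomes the Loc label.
  pivot_longer(cols = matches("^(Mean|SE)\\."),
               names_to = c(".value", "Loc"),
               names_sep = "\\.") %>%
  rename(mean = Mean, se = SE) %>%
  # mean +/- 1 SE, as asked in the question
  mutate(min = mean - se, max = mean + se)
```

The resulting plot.dat has the same ID, Loc, mean, se, min and max columns as the gather-based version, so the ggplot call above should work on it unchanged. Restricting cols to the Mean/SE columns leaves any other columns of the real table in place as identifier columns, so dropping them first with select may be tidier.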


Adam Quek
  • Thanks for the help, that worked. I am trying to understand what you did, and had 2 questions: I take it `var` is a call that dplyr knows is supposed to mean "columns", or something similar, is this correct? Also, in my example, the `ID`s are single letters to begin with. When I try this whole procedure on my real data, where the `ID`s have longer names, `multtest.gp` shortens the names to single or double letters (after `mean.` and `SE.`, if that makes sense). How can I get the full names back in that situation? – Ryan Jun 26 '20 at 12:44
  • @Ryan I've annotated the logic and changed `var` to `Loc` to avoid confusion. I do not understand your second question though. – Adam Quek Jun 27 '20 at 01:42
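
On the shortened names raised in the first comment: one possible workaround, sketched here under assumptions, is to skip the summary table and compute the per-group mean and SE straight from the raw data, so the original Loc labels are never altered. The data below are hypothetical (two long location names, two variables), and the SE is taken as sd / sqrt(n), which is assumed rather than verified to match what multtest.gp reports.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

set.seed(123)
# Hypothetical raw data with long location names
dat <- data.frame(Loc  = rep(c("north_site", "south_site"), each = 10),
                  var1 = rnorm(20),
                  var2 = rnorm(20))

summ <- dat %>%
  # long format: one row per observation and variable
  pivot_longer(cols = starts_with("var"), names_to = "ID", values_to = "value") %>%
  group_by(ID, Loc) %>%
  # assumed SE definition: standard deviation over sqrt(group size)
  summarise(mean = mean(value), se = sd(value) / sqrt(n()), .groups = "drop")

p <- ggplot(summ, aes(Loc, mean)) +
  geom_point() +
  geom_errorbar(aes(ymin = mean - se, ymax = mean + se), width = 0.2) +
  facet_wrap(~ID)
```

Printing p draws the faceted plot with the full location names on the x axis. The permutation test itself can still be run with multtest.gp separately; this sketch only replaces the step that extracts the means and SEs for plotting.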