
I am trying to compute the standard deviation of my dataset group-wise, where each group runs from one AE to the next AE. The data looks something like this:

ID   Pay_ee  Pay_em Post
1    100      102   AE
1    105      112   RE
1    103      112   RE
1    106      123   RE
1    101      121   RE
1    109      143   AE
1    110      113   ME
1    115      132   RE
1    123      120   AE
1    100      120   AE
1    100      120   RE

I used ggplot to plot Pay_ee and Pay_em. Now I am having difficulty representing the standard deviation from one AE to the next in my ggplot. That means I first have to calculate the standard deviation for each stretch from one AE to the next AE, and then plot it.

I tried to follow the answer at this link, but there the calculation is done for the whole dataset.

Do you have any idea how I can do it?

fairy tail
  • Which variance are you trying to calculate `Pay_ee` or `Pay_em`? – David Arenburg Aug 09 '15 at 12:06
  • @DavidArenburg For both. – fairy tail Aug 09 '15 at 12:20
  • To calculate SD for both is very easy, for example using [the devel version from GH](https://github.com/Rdatatable/data.table/wiki/Installation) of `data.table` (v >= 1.9.5) we could simply do `library(data.table) ; setDT(df)[, lapply(.SD, sd), by = .(ID, rleid(Post == "AE"))]`. How you want to plot this is a different question. – David Arenburg Aug 09 '15 at 12:33
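
The grouping idea from the comment above can also be sketched in base R. This is only a minimal sketch under one assumption about what "from AE to AE" means: every AE row opens a new segment that runs up to the row before the next AE. The `segment` column and the `seg_sd` name are made up for illustration and are not part of the original data.

```r
# Toy data copied from the question.
df <- read.table(header = TRUE, text = "
ID Pay_ee Pay_em Post
1 100 102 AE
1 105 112 RE
1 103 112 RE
1 106 123 RE
1 101 121 RE
1 109 143 AE
1 110 113 ME
1 115 132 RE
1 123 120 AE
1 100 120 AE
1 100 120 RE")

# Assumption: each AE row starts a new segment, so a running count of
# AE rows works as a segment id (hypothetical helper column).
df$segment <- cumsum(df$Post == "AE")

# Standard deviation of both pay columns per ID and segment.
# sd() returns NA for a segment that contains a single row.
seg_sd <- aggregate(cbind(Pay_ee, Pay_em) ~ ID + segment, data = df, FUN = sd)
print(seg_sd)
```

With several IDs in the data, the running count would have to be restarted per ID (for example with `ave()`), which this sketch does not do.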

1 Answer

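Using dplyr, tidyr and ggplot2 you can compute the mean and standard deviation of each pay column and plot them with error bars. Note that the code below groups by the value of Post (AE, ME, RE), not by consecutive AE-to-AE stretches.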

library(dplyr)
library(tidyr)
library(ggplot2)

df <- read.table(header = TRUE,
                 text = 
"ID   Pay_ee  Pay_em Post
1    100      102   AE
1    105      112   RE
1    103      112   RE
1    106      123   RE
1    101      121   RE
1    109      143   AE
1    110      113   ME
1    115      132   RE
1    123      120   AE
1    100      120   AE
1    100      120   RE")

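df %>% 
  # reshape to long format: one row per observation and pay column
  gather(key, value, starts_with("Pay_")) %>% 
  # one group per Post value (AE, ME, RE) and pay column
  group_by(Post, key) %>%
  summarize(m = mean(value), 
            sd = sd(value)) %>% 
  # print the summary table, then pass it on to ggplot
  print() %>%
  ggplot(.) + 
  theme_bw() + 
  # mean as a point, mean +/- one standard deviation as the error bar
  aes(x = Post, y = m, ymin = m - sd, ymax = m + sd, color = key) + 
  geom_point(position = position_dodge(width = 0.5)) + 
  geom_errorbar(position = position_dodge(width = 0.5)) + 
  ylab("Pay")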
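
If the error bars should describe each stretch from one AE to the next rather than each Post value, the same pipeline can be grouped by a segment id instead. The following is a rough sketch, assuming that every AE row starts a new segment within an ID; the names `segment`, `seg_stats` and `p` are hypothetical and chosen only for this example.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Toy data copied from the question.
df <- read.table(header = TRUE, text = "
ID Pay_ee Pay_em Post
1 100 102 AE
1 105 112 RE
1 103 112 RE
1 106 123 RE
1 101 121 RE
1 109 143 AE
1 110 113 ME
1 115 132 RE
1 123 120 AE
1 100 120 AE
1 100 120 RE")

seg_stats <- df %>%
  group_by(ID) %>%
  # Assumption: a running count of AE rows marks the AE-to-AE segments.
  mutate(segment = cumsum(Post == "AE")) %>%
  ungroup() %>%
  # Long format, as in the answer above: one row per observation and pay column.
  gather(key, value, starts_with("Pay_")) %>%
  group_by(ID, segment, key) %>%
  summarise(m = mean(value), sd = sd(value)) %>%
  ungroup()

# Mean per segment with +/- one standard deviation as error bars.
# Segments with a single row have sd = NA, so ggplot2 drops their bars
# with a warning.
p <- ggplot(seg_stats,
            aes(x = factor(segment), y = m,
                ymin = m - sd, ymax = m + sd, color = key)) +
  geom_point(position = position_dodge(width = 0.5)) +
  geom_errorbar(position = position_dodge(width = 0.5), width = 0.2) +
  labs(x = "Segment (starting at each AE)", y = "Pay") +
  theme_bw()
print(p)
```

Whether the closing AE row belongs to the segment it ends or to the one it starts is a modelling choice; this sketch counts it as the start of the next segment.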


Peter