I have a regression with observed (Observed) and estimated (Est) values, as shown in the head() output below.

data <- structure(list(IndID = structure(c(1L, 2L, 3L, 5L, 6L, 7L, 8L, 
9L, 10L, 11L, 12L, 13L, 14L, 15L, 16L, 17L, 18L, 19L, 20L, 21L, 
22L, 23L), .Label = c("CAL_F01", "CAL_F17", "CAL_F19", "CAL_F23", 
"CAL_F43", "CAL_M33", "CAL_M36", "COL_P01", "COL_P03", "COL_P05", 
"COL_P06", "COL_P07", "COL_P08", "COL_P09", "COL_P10", "COL_P12", 
"COL_P13", "PAT_F03", "PAT_F04", "PAT_F05", "PAT_M02", "PAT_M03", 
"PAT_M04"), class = "factor"), StudyArea = structure(c(1L, 1L, 
1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 
3L, 3L, 3L, 3L), .Label = c("Cali", "Colo", "Pata"), class = "factor"), 
    Observed = c(22L, 50L, 8L, 54L, 30L, 11L, 90L, 53L, 9L, 42L, 
    72L, 40L, 60L, 58L, 20L, 37L, 50L, 67L, 20L, 19L, 58L, 5L
    ), variable = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
    1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = "PredKills", class = "factor"), 
    Est = c(28, 52, 6, 35, 31, 13, 80, 62, 4, 43, 66, 43, 55, 
    42, 20, 47, 36, 84, 20, 17, 36, 6), SE = c(3.50031581162016, 
    4.8742514125436, 1.20589766104628, 4.79430832229519, 3.87541734990744, 
    2.36031827307993, 6.35148447967163, 5.52456747941261, 1.60267125934065, 
    4.53967516735091, 6.61559705260502, 5.35175112687543, 5.89582419295991, 
    5.18042529534246, 3.43767468948519, 4.69809433696684, 3.80733165582324, 
    5.85520173339347, 3.151903629499, 2.64621136787301, 4.64130814363024, 
    1.41537000011436)), .Names = c("IndID", "StudyArea", "Observed", 
"variable", "Est", "SE"), row.names = c(1L, 2L, 3L, 5L, 6L, 7L, 
8L, 9L, 10L, 11L, 12L, 13L, 14L, 15L, 16L, 17L, 18L, 19L, 20L, 
21L, 22L, 23L), class = "data.frame")


> head(data)
    IndID StudyArea Observed  variable Est       SE
1 CAL_F01      Cali       22 PredKills  28 3.500316
2 CAL_F17      Cali       50 PredKills  52 4.874251
3 CAL_F19      Cali        8 PredKills   6 1.205898
5 CAL_F43      Cali       54 PredKills  35 4.794308
6 CAL_M33      Cali       30 PredKills  31 3.875417
7 CAL_M36      Cali       11 PredKills  13 2.360318

This code makes the plot below.

library(ggplot2)

p2 <- ggplot(data, aes(x=Observed, y=Est, color=StudyArea))
p2+ geom_point(shape="*", size = 12) + 
  geom_abline(intercept =0, slope = 1, size = 1)+
  geom_errorbar(aes(x= Observed, ymin=Est-SE, ymax=Est+SE, color=StudyArea),width = 0.5,cex=1, lty=2)+
  scale_color_manual(values=c("red","blue","darkgreen"))+
  ylim(2,92)+ xlim(2,92)

[Figure: plot produced by the code above]

The solid line represents a one-to-one fit, i.e. if Observed = Est, the points fall on the line. The residual from the line is the error.

Question: How do I add a shaded region around the 45-degree line that represents 10% error? I think I need to use geom_ribbon, as suggested at this SO post, but I have not been able to produce the correct result.

ADDITION: The shaded region should not vary as a function of either the x or y axis, but should be constant along the 45-degree line.

I want to keep the x-axis values the same and, by changing the y-axis values, shade the region between a line 0.9 lower than Est and a line 0.9 greater than Est.

Here is the code I have been working with to add an additional line that is 0.9 lower than Est.

p2+ geom_point(shape="*", size = 12) + 
  geom_errorbar(aes(x= Observed, ymin=Est-SE, ymax=Est+SE, color=StudyArea),width = 0.5,cex=1, lty=2)+
  coord_cartesian(ylim=c(2,92), xlim=c(2,92))+
  scale_color_manual(values=c("red","blue","darkgreen"))+
  geom_abline(intercept =0, slope = 1, size = 1, col="red")+
  geom_abline(data=data.frame(x=seq(1,92,1),y=seq(1,92,1)), aes(x=x, y=y*0.9),lty=2, cex=1)

With the last line of code (taking hints from @BrodieG), I create a new data.frame with a sequence of values that mirrors the Observed and Est data. In the aes function I multiply y by 0.9. In my mind, this line should be 0.9 lower than Est rather than on top of the one-to-one line.

My hope was to add a lower and upper line and then shade between them, although there is likely a better way to do this.

Hope this is a bit more clear...

Thanks in advance!

B. Davis

1 Answer


Here is an implementation, though I'm not sure I'm doing exactly what you want:

p2 <- ggplot(data, aes(x=Observed, y=Est, color=StudyArea))
p2+ 
  geom_ribbon(data=data.frame(x=c(0,100)), aes(x=x, ymin=x * .9, ymax=x * 1.1), fill="gray", inherit.aes=FALSE, alpha=0.5) +
  geom_abline(intercept =0, slope = 1, size = 1)+
  geom_point(shape="*", size = 12) + 
  geom_errorbar(aes(x= Observed, ymin=Est-SE, ymax=Est+SE, color=StudyArea),width = 0.5,cex=1, lty=2)+
  scale_color_manual(values=c("red","blue","darkgreen"))+
  coord_cartesian(ylim=c(2,92), xlim=c(2,92))
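
As a numeric companion to the ribbon, one can flag which points fall inside the band. This is only a minimal sketch, assuming "10% error" means an estimate within 10% of Observed (the same reading as the `x * .9` / `x * 1.1` ribbon). The names `obs`, `tol`, and `in_band` are illustrative, and the rows are the first six from the question's head(data) output.

```r
# First six rows of the posted data, copied so the sketch is self-contained
obs <- data.frame(
  IndID    = c("CAL_F01", "CAL_F17", "CAL_F19", "CAL_F43", "CAL_M33", "CAL_M36"),
  Observed = c(22, 50, 8, 54, 30, 11),
  Est      = c(28, 52, 6, 35, 31, 13)
)

tol <- 0.10  # assumed relative tolerance (10% of Observed)

# TRUE when the estimate lies between 0.9 * Observed and 1.1 * Observed
obs$in_band <- abs(obs$Est - obs$Observed) <= tol * obs$Observed

print(obs)
```

With these six rows, only CAL_F17 and CAL_M33 fall inside the band.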


BrodieG
  • I added ADDITION above for clarification. – B. Davis Feb 20 '14 at 19:24
  • @B.Davis, just change the `* .9` and `* 1.1` to `+- x`, where `x` is whatever band width you want. I don't know how to interpret your "10% error"; should it be 10% of the mean? – BrodieG Feb 20 '14 at 20:03
  • I have added more detail that I hope is clear. Thanks in advance. – B. Davis Feb 20 '14 at 20:48
  • I have been working through your initial post and think that it is correct. I apologize for my confusion, and thanks for your help! – B. Davis Feb 20 '14 at 21:31
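
The constant-width variant mentioned in the comments could look roughly like the sketch below. It assumes the band should extend a fixed vertical distance `w` above and below the one-to-one line. The value `w = 5` is a placeholder, and `obs` and `band` are illustrative names; `obs` holds only the first six rows of the posted data.

```r
library(ggplot2)

# First six rows of the posted data, copied so the sketch is self-contained
obs <- data.frame(
  Observed  = c(22, 50, 8, 54, 30, 11),
  Est       = c(28, 52, 6, 35, 31, 13),
  StudyArea = "Cali"
)

w <- 5  # hypothetical half-width of the band, in the units of Est

# Two x values are enough because both band edges are straight lines
band <- data.frame(x = c(0, 100))
band$lower <- band$x - w
band$upper <- band$x + w

p <- ggplot(obs, aes(x = Observed, y = Est, color = StudyArea)) +
  # inherit.aes = FALSE keeps the ribbon from looking for Observed/Est in `band`
  geom_ribbon(data = band, aes(x = x, ymin = lower, ymax = upper),
              fill = "gray", alpha = 0.5, inherit.aes = FALSE) +
  geom_abline(intercept = 0, slope = 1) +
  geom_point() +
  coord_cartesian(xlim = c(2, 92), ylim = c(2, 92))

print(p)
```

Here `w` is measured vertically, so the band has the same height at every x, unlike the multiplicative ribbon, which widens as x grows.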