The code below produces a 'pseudo' Gantt chart that is meant to show the duration of some wars (x-axis, in calendar years) plus the number of casualties. I would be grateful if you could help me solve two issues I am facing.

1) I would like to sort the y-axis labels (variable/factor WarName) according to the start date (war.start) of each war (geom_segment) within each country (facet WarLocationCountry). The war that starts earliest should be at the top of the y-axis; e.g. for Sudan the ordering should be: First South Sudan, Second South Sudan, The SPLA Division, Darfur.

I assume it has something to do with scale_y_discrete(rev(levels(CoW.tmp$WarLocationCountry))), but I couldn't figure out how to make it depend on CoW.tmp$war.start.

2) geom_text adds the estimated number of casualties (sum.deaths; numeric) next to the geom_segments; these estimates include several NA / missing values. Whenever I keep them as NA, I get the error message: Error: 'x' and 'units' must have length > 0. I thought adding na.rm=TRUE to the geom_text call would resolve this, but unfortunately that's not the case.

Currently the missing data are coded as 0. Recoding them with CoW.tmp$sum.deaths[CoW.tmp$sum.deaths==0] <- NA leads to the error above when running the ggplot code.

Sorry for not formulating this question in a more general way. Many thanks for any hint.

Code for graph:

CoW.plot <- ggplot(CoW.tmp) + 
  geom_segment(aes(color=WarType, x=war.start, xend=war.end, y=WarName, yend=WarName), size=1) +
  geom_point(aes(shape=Outcome2, color=WarType, x=war.end,y=WarName), size=3)+
  geom_point(aes(shape=WarType, color=WarType, x=war.start,y=WarName), size=3)+
  theme(plot.title=element_text(face="bold"),
        legend.position="bottom", 
        legend.title=element_text(size=7),
        legend.text=element_text(size=5),
        legend.box="horizontal",
        axis.title.x = element_blank(),
        axis.text.x  = element_text(size=5),
        axis.title.y = element_blank(),
        axis.text.y  = element_text(size=5, face="bold"))+
  scale_color_discrete(name="War Type:",
                       breaks=c("4","5","6","7"),
                       labels=c("central control","local issues","regional internal","intercommunal"))+
  scale_shape_manual(values=c(1,3,4,5,6,7), name="Outcome:",
                       breaks=c("1","3","4","5","6","7"),
                       labels=c("victory", "compromise","transformed type of war","ongoing","stalemate","continues below war threshold"))+
  geom_text(aes(x=as.Date(conflict.end+1500), y=WarName, label=sum.deaths), size=2, na.rm=TRUE)+
  scale_x_date(limits = c(as.Date("1946-01-01"), as.Date("2010-01-01")))+
  ggtitle(paste("INTRA-STATE CONFLICTS (CoW)",a,"\n"))+
  facet_wrap(~WarLocationCountry, scales="free_y", ncol=1)

Data:

CoW.tmp<-structure(list(conflict.end = structure(c(788, -2178, -1310, 
3648, 5921, 6569, 12793, 12793, 6496, 8881, 7695, 9609, 8354, 
9876, 9876, 9876, 9876, 9876, 9876, 9876, 11271, 11271, 11271, 
11271, 11271, 11271, 11271, 11271, 11271, 13493, 14041, 14041, 
14041, 14041), class = "Date"), WarType = structure(c(2L, 1L, 
2L, 1L, 2L, 1L, 1L, 1L, 1L, 3L, 1L, 1L, 3L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 1L, 1L
), .Label = c("4", "5", "7"), class = "factor"), war.start = structure(c(-2284, 
-2181, -1319, 1092, 3994, 4762, 5068, 8140, 6070, 6562, 6720, 
7751, 7909, 8382, 7988, 8382, 8382, 8382, 8382, 8382, 10263, 
10263, 10263, 10263, 11085, 11088, 11088, 11088, 11088, 12109, 
13520, 13213, 13430, 13440), class = "Date"), war.end = structure(c(788, 
-2178, -1310, 3648, 5921, 6569, 7908, 12793, 6496, 8881, 7695, 
9609, 8354, 9190, 9876, 9190, 9190, 9190, 8849, 9190, 10779, 
10779, 10779, 10779, 11271, 11271, 11271, 11271, 11271, 13493, 
13667, 14031, 14041, 14041), class = "Date"), WarName = c("First South Sudan", 
"Zanzibar Arab-African", "First Uganda", "Rhodesia", "Second Uganda", 
"Matabeleland", "Second South Sudan", "Second South Sudan", "Holy Spirit Movement", 
"Inkatha-ANC", "First Somalia", "First Sierra Leone", "The SPLA Division (Dinka-Nuer) War", 
"Second Somalia", "Second Somalia", "Second Somalia", "Second Somalia", 
"Second Somalia", "Second Somalia", "Second Somalia", "Second Sierra Leone", 
"Second Sierra Leone", "Second Sierra Leone", "Second Sierra Leone", 
"Second Sierra Leone", "Second Sierra Leone", "Second Sierra Leone", 
"Second Sierra Leone", "Second Sierra Leone", "Darfur", "Third Somalia", 
"Third Somalia", "Third Somalia", "Third Somalia"), Outcome2 = structure(c(3L, 
1L, 1L, 1L, 1L, 1L, 3L, 3L, 1L, 6L, 1L, 1L, 7L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 6L, 1L, 1L, 1L, 
1L), .Label = c("1", "2", "3", "4", "5", "6", "7"), class = "factor"), 
    sum.deaths = c("0", "0", "0", "11000", "46000", "0", "0", 
    "0", "7000", "0", "0", "0", "0", "70", "70", "70", "70", 
    "70", "70", "70", "0", "0", "0", "0", "0", "0", "0", "0", 
    "0", "0", "0", "0", "0", "0"), WarLocationCountry = structure(c(4L, 
    6L, 5L, 7L, 5L, 7L, 4L, 4L, 5L, 3L, 2L, 1L, 4L, 2L, 2L, 2L, 
    2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 4L, 2L, 
    2L, 2L, 2L), .Label = c("Sierra Leone", "Somalia", "South Africa", 
    "Sudan", "Uganda", "Zanzibar", "Zimbabwe"), class = "factor")), .Names = c("conflict.end", 
"WarType", "war.start", "war.end", "WarName", "Outcome2", "sum.deaths", 
"WarLocationCountry"), class = "data.frame", row.names = c(34L, 
39L, 44L, 67L, 114L, 120L, 127L, 128L, 134L, 136L, 138L, 152L, 
155L, 157L, 158L, 159L, 160L, 161L, 162L, 163L, 197L, 198L, 199L, 
200L, 201L, 202L, 203L, 204L, 205L, 237L, 246L, 247L, 248L, 249L
))
zoowalk
  • For sorting the levels (your (1)), don't try to do it inside `ggplot`. It just looks at the order of the factor's levels. I think `reorder()` is the easiest way to change that order based on another variable. See `?reorder` or [this question](http://stackoverflow.com/q/2375587/903061) for more info. – Gregor Thomas Jul 08 '14 at 15:45
  • @zoowalk, you may check [**here**](http://stackoverflow.com/questions/16622979/reorder-not-correctly-reordering-a-factor-variable-in-ggplot) and [**here**](http://stackoverflow.com/questions/18816024/how-to-show-bars-in-ggplot2-in-descending-order-of-a-numeric-vector/18816504#18816504) for examples of using `reorder` within the `aes` call. – Henrik Jul 08 '14 at 16:00
  • +1 for including your dataset. Your example does not run, though: the variable `a` in `ggtitle(...)` is not defined. – jlhoward Jul 08 '14 at 16:36
  • You can solve your second problem using `label=ifelse(sum.deaths!=0,sum.deaths,"")` in the call to `geom_text(...)`. – jlhoward Jul 08 '14 at 16:37

1 Answer

Something like this?

library(ggplot2)
CoW.tmp <- with(CoW.tmp,CoW.tmp[order(WarLocationCountry,-as.integer(war.start)),])
CoW.tmp$WarName <- with(CoW.tmp,factor(WarName,levels=unique(WarName)))
ggplot(CoW.tmp) + 
  geom_segment(aes(color=WarType, x=war.start, xend=war.end, y=WarName, yend=WarName), size=1) +
  geom_point(aes(shape=Outcome2, color=WarType, x=war.end,y=WarName), size=3)+
  geom_point(aes(shape=WarType, color=WarType, x=war.start,y=WarName), size=3)+
  theme(plot.title=element_text(face="bold"),
        legend.position="bottom", 
        legend.title=element_text(size=7),
        legend.text=element_text(size=5),
        legend.box="vertical",
        axis.title.x = element_blank(),
        axis.text.x  = element_text(size=10),
        axis.title.y = element_blank(),
        axis.text.y  = element_text(size=10, face="bold"))+
  scale_color_discrete(name="War Type:",
                       breaks=c("4","5","6","7"),
                       labels=c("central control","local issues","regional internal","intercommunal"))+
  scale_shape_manual(values=c(1,3,4,5,6,7), name="Outcome:",
                     breaks=c("1","3","4","5","6","7"),
                     labels=c("victory", "compromise","transformed type of war","ongoing","stalemate","continues below war threshold"))+
  geom_text(aes(x=as.Date(conflict.end+1500), y=WarName, label=ifelse(sum.deaths!=0,sum.deaths,"")), size=3, na.rm=TRUE)+
  scale_x_date(limits = c(as.Date("1946-01-01"), as.Date("2010-01-01")))+
  ggtitle(paste("INTRA-STATE CONFLICTS (CoW)","","\n"))+
  facet_wrap(~WarLocationCountry, scales="free_y", ncol=1)

Your first problem, with the ordering of the y-axis, is a bit more subtle than the comments suggest. You need the wars in reverse order of start date within each country, because ggplot2 draws the first factor level at the bottom of the y-axis. The simplest way to do this, I think, is to re-order your whole data frame CoW.tmp by country and start date, and then reset the levels of the WarName factor to that order (the first two lines of code after library(ggplot2)). You cannot use -war.start in the call to the order(...) function, because unary minus does not work on dates, so we have to use -as.integer(war.start). as.integer(war.start) returns the number of days since 1970-01-01 as an integer, which we can negate.
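To make the two re-ordering lines easier to follow, here is a minimal sketch on a toy data frame. The data frame `wars` and the variable `minus.fails` are names made up for this illustration; the three rows reuse the Sudan start dates (as day counts since 1970-01-01) from the question's data.

```r
# Toy subset of the Sudan wars (hypothetical data frame, for illustration only)
wars <- data.frame(
  WarName = c("First South Sudan", "The SPLA Division (Dinka-Nuer) War", "Darfur"),
  WarLocationCountry = "Sudan",
  war.start = as.Date(c(-2284, 7909, 12109), origin = "1970-01-01"),
  stringsAsFactors = FALSE
)

# Unary minus is not defined for Date objects, so this attempt fails
minus.fails <- inherits(try(-wars$war.start, silent = TRUE), "try-error")

# Sort by country, then by start date from latest to earliest
wars <- wars[order(wars$WarLocationCountry, -as.integer(wars$war.start)), ]

# Fix the factor levels in that row order; the earliest war becomes the last
# level, which ggplot2 places at the top of a discrete y-axis
wars$WarName <- factor(wars$WarName, levels = unique(wars$WarName))
print(levels(wars$WarName))
```

With these three rows the levels come out as Darfur, The SPLA Division, First South Sudan, so First South Sudan ends up at the top of the axis.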

Even this is only a partial solution. Your dataset seems to contain several duplicated records (Second Somalia appears multiple times, as do several other wars). This causes the problem with Sudan, where the SPLA Division war starts after the first instance of Second South Sudan and before the second instance. That is why the y-axis is not ordered correctly in that case.
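One possible workaround for the duplicated records, sketched here under the assumption that each war should be positioned by its earliest recorded start date, is to sort in ascending order, let `unique()` keep the first (earliest) occurrence of each name, and then reverse the levels. The data frame `wars` and the helper `asc` are hypothetical names for this example; the start dates are the Sudan values from the question's data.

```r
# Toy Sudan rows, including the duplicated "Second South Sudan" record
# (hypothetical data frame, for illustration only)
wars <- data.frame(
  WarName = c("First South Sudan", "Second South Sudan", "Second South Sudan",
              "The SPLA Division (Dinka-Nuer) War", "Darfur"),
  WarLocationCountry = "Sudan",
  war.start = as.Date(c(-2284, 5068, 8140, 7909, 12109), origin = "1970-01-01"),
  stringsAsFactors = FALSE
)

# Ascending sort: the first occurrence of each war name is its earliest start
asc <- wars[order(wars$WarLocationCountry, wars$war.start), ]

# unique() keeps first occurrences; rev() puts the earliest war last,
# which is the top of a discrete y-axis in ggplot2
wars$WarName <- factor(wars$WarName, levels = rev(unique(asc$WarName)))
print(levels(wars$WarName))
```

Read from top to bottom on the plot, this gives First South Sudan, Second South Sudan, The SPLA Division, Darfur. Whether the earliest start is the right date to use for a duplicated war depends on what the duplicates mean in your data.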

Your second problem, about the labels, is solved as in my comment above.
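As a small sketch of the label logic on its own: the vector below is made up for this example, and the `is.na()` check is an addition that goes beyond the comment, in case some values are already NA rather than "0". Note that sum.deaths is stored as character in the posted data.

```r
# Hypothetical casualty values as character strings, including a missing one
sum.deaths <- c("0", "11000", NA, "70")

# Blank label for zero or missing estimates, the value itself otherwise
death.label <- ifelse(is.na(sum.deaths) | sum.deaths == "0", "", sum.deaths)
print(death.label)
```

The same expression can be used directly as the `label` aesthetic inside `geom_text(aes(...))`.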

Note also that I tweaked the font sizes and set legend.box="vertical" just so that the image displays well on SO. If you're exporting to PDF or some other format, you will want to change those back.

Jaap
jlhoward