
I would really appreciate some insight into the zigzag pattern I get when using the following code in R:

tbi_military %>%
  ggplot(aes(x = year, y = diagnosed, color = service)) +
  geom_line() +
  facet_wrap(vars(severity))

The dataset comprises 5 variables (3 character, 2 numeric). Any insight would be much appreciated.


jgrabow1
  • 1
    Please provide a sample of your data with `dput()` – Vinícius Félix Aug 30 '21 at 16:06
  • 2
    This pattern typically appears when you have an incomplete group specification. Likely, you have an additional grouping variable in your data that separates something within services per year. You should include it in `aes(group = interaction(service, {additional_grouping_variable}))`. – teunbrand Aug 30 '21 at 16:19
  • Please provide enough code so others can better understand or reproduce the problem. – Community Sep 01 '21 at 12:40

1 Answer


This is just an illustration with a standard dataset. Let's say we're interested in plotting the weight of chicks over time depending on a diet. We would attempt to plot this like so:

library(ggplot2)

ggplot(ChickWeight, aes(Time, weight, colour = factor(Diet))) +
  geom_line()

The zigzag pattern appears because there are multiple observations per diet and time point. `geom_line()` connects the observations of each group in order of their x value, so the several weights recorded at one time point for a diet are joined by a vertical segment spanning their range.
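To check whether this is what is happening in your own data, you can count the rows per colour group and x value. This is only a minimal sketch in base R using the same ChickWeight example; the names `counts` and `n_obs` are made up for illustration.

```r
# Count how many rows share each Diet/Time combination.
counts <- aggregate(weight ~ Diet + Time, data = ChickWeight, FUN = length)
names(counts)[3] <- "n_obs"  # rename the count column (illustrative name)

head(counts)

# TRUE means at least one Diet/Time combination has several y values,
# which geom_line() will join with a vertical segment.
any(counts$n_obs > 1)
```

If the last line returns TRUE, the line is being drawn through several values at the same x position, and you need either a finer grouping or a summary.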

The data has an additional variable called 'Chick' that identifies individual chicks. Including it in the grouping resolves the zigzag pattern, and every line then shows the weight over time of one chick.

ggplot(ChickWeight, aes(Time, weight, colour = factor(Diet))) +
  geom_line(aes(group = interaction(Chick, Diet)))

If you don't have an extra variable that separates out individual trends, you could instead summarise the data, for example by taking the mean at every time point.

ggplot(ChickWeight, aes(Time, weight, colour = factor(Diet))) +
  geom_line(stat = "summary", fun = mean)

Created on 2021-08-30 by the reprex package (v1.0.0)
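If you would rather have the summarised values in a data frame, you could also compute the means before plotting. The sketch below assumes base R's `aggregate()` is acceptable for this step; `chick_means` is just an illustrative name. It should draw one line per diet, like the `stat = "summary"` version.

```r
library(ggplot2)

# One mean weight per Diet/Time combination.
chick_means <- aggregate(weight ~ Diet + Time, data = ChickWeight, FUN = mean)

# Each Diet now has a single y value per Time, so no zigzag appears.
ggplot(chick_means, aes(Time, weight, colour = Diet)) +
  geom_line()
```

Having the means in their own data frame also makes it easy to inspect or export them.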

teunbrand