I need to visualize and compare two equally long sales periods, 2018/2019 and 2019/2020. Both periods begin at week 44 and end at week 36 of the following year. If I plot them against the year-week, the two periods appear one after the other instead of overlapping. If I use only the week number, the values are sorted numerically as one continuum, so the period no longer starts at week 44 and the graph does not make sense. Can you think of a solution?

Thank you.

Data:

library(ggplot2)
library(tsibble)  # provides yearweek()

set.seed(1)
df1 <- data.frame(sells = runif(44),
                  week = c(44:52,1:35),
                  YW = yearweek(seq(as.Date("2018-11-01"), as.Date("2019-08-31"), by = "1 week")),
                  period = "18/19")

df2 <- data.frame(sells = runif(44),
                  week = c(44:52,1:35),
                  YW = yearweek(seq(as.Date("2019-11-01"), as.Date("2020-08-31"), by = "1 week")),
                  period = "19/20")

# Yearweek on the x axis: the two periods are separated

ggplot(df1, aes(YW, sells)) +
  geom_line(aes(color="Period 18/19")) + 
  geom_line(data=df2, aes(color="Period 19/20")) +
  labs(color="Legend text")

# Week on the x axis: weeks are treated as one continuum and not split by year
ggplot(df1, aes(week, sells)) +
  geom_line(aes(color="Period 18/19")) + 
  geom_line(data=df2, aes(color="Period 19/20")) +
  labs(color="Legend text")
Eischias

3 Answers

3

Another alternative is to facet it. This'll require combining the two sets into one, preserving the data source. (This is commonly a better way of dealing with it in general, anyway.)

(I don't have tsibble installed, so my YW is just the seq(...) of dates, without yearweek. It should translate.)

ggplot(dplyr::bind_rows(tibble::lst(df1, df2), .id = "id"), aes(YW, sells)) +
  geom_line(aes(color = id)) +
  facet_wrap(id ~ ., scales = "free_x", ncol = 1)

faceted ggplot2

In place of dplyr::bind_rows, one might also use data.table::rbindlist(..., idcol="id"), or do.call(rbind, ...), though with the latter you will need to assign id externally.
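
As a rough sketch of the do.call(rbind, ...) route, the id column can be attached to each data frame before binding. The toy data and the names dfs and combined are illustrative assumptions, not part of the original answer.

```r
# Toy stand-ins for df1 and df2 (illustrative data only)
df1 <- data.frame(sells = c(0.2, 0.5, 0.9), week = c(51, 52, 1))
df2 <- data.frame(sells = c(0.4, 0.1, 0.7), week = c(51, 52, 1))

# Name the list so each element carries its own label
dfs <- list(df1 = df1, df2 = df2)

# Prepend an id column to each data frame, then stack them row-wise
combined <- do.call(
  rbind,
  Map(function(d, nm) cbind(id = nm, d), dfs, names(dfs))
)
rownames(combined) <- NULL  # drop the "df1.1", "df1.2", ... row names

print(combined)
```

The resulting combined data frame can then be passed to ggplot() in place of the dplyr::bind_rows(...) call.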

One more note: the default formatting of the x-axis is obscuring the "year" of the data. If this is relevant/important (and not apparent elsewhere), then use ggplot2's normal mechanism for forcing labels, e.g.,

... +
  scale_x_date(labels = function(z) format(z, "%Y-%m"))

faceted ggplot2 with updated x-axis labels

In the unlikely case that you have dplyr but not tibble::lst available, you can replace it with list(df1 = df1, df2 = df2) or similar.

r2evans
  • Yes, I think this is nicest on the eye and has the clearest x axis labelling. – Allan Cameron Dec 14 '20 at 13:29
  • I agree ... and I know there are times when direct overlap is desired in order to see minute vertical differences between the lines. We lose that dimension a little with this, though macro differences should be evident-enough given the same y-axis limits. – r2evans Dec 14 '20 at 13:33
  • Oops, `tibble::lst`, thanks. Fixed/noted. – r2evans Dec 14 '20 at 13:37
  • @r2evans Thanks, your solution is visually the cleanest, but given the nature of the data and the customer =D, I have to put the comparison in one graph to clearly see the difference between the sales periods – Eischias Dec 14 '20 at 13:40
  • I thought that might be the case. – r2evans Dec 14 '20 at 13:41
2

If you want to keep the x axis as a numeric scale, you can do:

ggplot(df1, aes((week + 9) %% 52, sells)) +
  geom_line(aes(color="Period 18/19")) + 
  geom_line(data=df2, aes(color="Period 19/20")) +
  scale_x_continuous(breaks = 1:52,
                     labels = function(x) ifelse(x == 9, 52, (x - 9) %% 52), 
                     name = "week") +
  labs(color="Legend text")
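
To see why the shift and the label function fit together, the arithmetic can be checked in base R without plotting. This is only a sketch: the helper name to_label is made up here and simply repeats the labels function from the code above.

```r
week <- c(44:52, 1:35)  # week numbers in the order the period runs

# Shift so that week 44 lands on position 1 and week 35 on position 44
pos <- (week + 9) %% 52

# Inverse mapping used for the axis labels; position 9 is special-cased
# because (9 - 9) %% 52 would give 0 rather than 52
to_label <- function(x) ifelse(x == 9, 52, (x - 9) %% 52)

all(pos == 1:44)             # positions are consecutive
all(to_label(pos) == week)   # labels recover the original week numbers
```

Without the ifelse special case, the position for week 52 would be labelled 0.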


Allan Cameron
  • I'm working with your solution, but I noticed one thing: the x-axis shows 0 instead of 52. I still haven't been able to figure out a fix. – Eischias Dec 14 '20 at 14:15
1

Try this. You can format your week variable as a factor and keep the desired order. Here is the code:

library(ggplot2)
library(tsibble)
# Data: fix the level order to the order in which the weeks appear
df1$week <- factor(df1$week, levels = unique(df1$week), ordered = TRUE)
df2$week <- factor(df2$week, levels = unique(df2$week), ordered = TRUE)
# Plot
ggplot(df1, aes(week, sells)) +
  geom_line(aes(color="Period 18/19",group=1)) + 
  geom_line(data=df2, aes(color="Period 19/20",group=1)) +
  labs(color="Legend text")
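
The reason this works can be checked in base R: by default factor() sorts its levels, while levels = unique(...) keeps them in order of first appearance. A small sketch (the variable names sorted_levels and period_levels are illustrative):

```r
week <- c(44:52, 1:35)

# Default factor(): levels are sorted numerically, so week 1 comes first
sorted_levels <- levels(factor(week))

# levels = unique(week): levels keep the order of first appearance,
# so the axis starts at week 44 and wraps around to week 1 after 52
period_levels <- levels(factor(week, levels = unique(week)))

head(sorted_levels)
head(period_levels)
```

Because the x axis is now discrete, geom_line would otherwise treat each week as its own group, which is why group = 1 is set in the plot code above.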


Duck
  • Thanks, this works great. I tried something similar. I put the week as a factor in the ggplot, but it didn't work. Thank you once again – Eischias Dec 14 '20 at 13:13