
Consider the following data:

library(ggplot2)
library(lubridate)

date <- seq.Date(ymd("2015-01-01"), Sys.Date(), by = "day")

df <- data.frame(date = date,
                 value = seq_along(date) + rnorm(length(date), sd = 100))

# Add yday and year
df$yday <- yday(df$date)
df$year <- year(df$date)

head(df)
#         date value yday year
# 1 2015-01-01    97    1 2015
# 2 2015-01-02    89    2 2015
# 3 2015-01-03    68    3 2015
# 4 2015-01-04    57    4 2015
# 5 2015-01-05    70    5 2015
# 6 2015-01-06   100    6 2015

I would like to make a "year over year" plot with color assigned to year. I can do this with the following:

ggplot(df, aes(x = yday, y = value, color = factor(year))) +
  geom_line()

Plot

But this results in the x-axis being "day of the year" rather than month labels. Adding + scale_x_date() fails because yday is no longer a date.

Is it possible to use scale_x_date()?

At the end of the day, I would like to do something like this:

ggplot(df, aes(x = date, y = value, color = factor(year))) +
  geom_line() +
  scale_x_date(date_labels = "%b")

But keep the years "stacked" on the same plot.

JasonAizkalns
  • I take it you don't want to use the work-around given in the answer to your [previous question](http://stackoverflow.com/questions/28503262/using-lubridate-and-ggplot2-effectively-for-date-axis)? Maybe clarify what's different about this, otherwise it looks like a duplicate. – aosmith Oct 04 '16 at 20:32

1 Answer


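How about this hack: we don't care what year yday comes from, so just convert it back to Date format, counting days from 1970-01-01. Every series then lands in the same reference year (1970) regardless of the actual year that a given yday came from, and we display only the month in the x-axis labels. One caveat: 1970 is not a leap year, so day 366 of a leap year spills just past it into January 1971.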

You don't really need to add yday or year columns to your data frame, as you can create them on the fly in the ggplot call.

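# yday() is 1-based, so subtract 1 to make Jan 1 land on the origin (1970-01-01)
ggplot(df, aes(x = as.Date(yday(date) - 1, origin = "1970-01-01"), y = value,
               color = factor(year(date)))) +
  geom_line() +
  scale_x_date(date_breaks = "1 month", date_labels = "%b") +
  labs(x = "Month", colour = "") +
  theme_bw()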

There's probably a cleaner way, and hopefully someone more skilled with R dates will come along and provide it.
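One candidate for a cleaner version, offered as a sketch rather than a definitive answer: rebuild each date in a single reference year with lubridate's make_date(). The reference year 2000 is an arbitrary choice (picked because it is a leap year, so Feb 29 stays valid), and the column name common_date and the fixed end date of the sample data are assumptions made here for reproducibility.

```r
library(ggplot2)
library(lubridate)

# Stand-in for the question's data, with a fixed end date so it is reproducible
set.seed(1)
date <- seq.Date(ymd("2015-01-01"), ymd("2017-12-31"), by = "day")
df <- data.frame(date = date,
                 value = seq_along(date) + rnorm(length(date), sd = 100))

# Move every date into the same reference year. 2000 is a leap year,
# so Feb 29 has somewhere to go. The year itself is never displayed.
df$common_date <- make_date(2000, month(df$date), day(df$date))

p <- ggplot(df, aes(x = common_date, y = value, color = factor(year(date)))) +
  geom_line() +
  scale_x_date(date_breaks = "1 month", date_labels = "%b") +
  labs(x = "Month", colour = "")
p
```

Because the month and day are carried over directly, Mar 1 lines up with Mar 1 in every year, which the day-of-year conversion does not guarantee across leap and non-leap years.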


eipi10
  • I like it, but what do we do, if we need to compare Sep-Apr (which includes a year change). With this solution we get an ugly gap in the middle and a wrong order (Jan-Apr; Sep-Dec). Hmm.... – stats-hb Jul 28 '20 at 09:50
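
For the Sep-Apr case raised in the comment, one possible variation is to split the dates across two consecutive reference years, so that the axis runs continuously from September onward. This is only a sketch: start_month, season, season_date, and the reference years 1999/2000 are names and values assumed here for illustration, not part of the original answer.

```r
library(ggplot2)
library(lubridate)

# Stand-in for the question's data, with a fixed end date
set.seed(1)
date <- seq.Date(ymd("2015-01-01"), ymd("2017-12-31"), by = "day")
df <- data.frame(date = date,
                 value = seq_along(date) + rnorm(length(date), sd = 100))

start_month <- 9  # assumption: a "season" starts in September

# TRUE for Sep-Dec, FALSE for Jan-Aug
late <- month(df$date) >= start_month

# Label each row by the calendar year in which its season began
df$season <- ifelse(late, year(df$date), year(df$date) - 1)

# Sep-Dec go to reference year 1999, Jan-Aug to 2000 (a leap year),
# so the axis runs continuously from Sep through Aug
df$season_date <- make_date(ifelse(late, 1999, 2000),
                            month(df$date), day(df$date))

# Keep only the Sep-Apr window from the comment
win <- df[late | month(df$date) <= 4, ]

p <- ggplot(win, aes(x = season_date, y = value, color = factor(season))) +
  geom_line() +
  scale_x_date(date_breaks = "1 month", date_labels = "%b") +
  labs(x = "Month", colour = "Season starting")
p
```

Each line is now one season rather than one calendar year, so there is no gap in the middle and the months appear in Sep-to-Apr order.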