
I'm looking for a way to force the date labels on a ggplot to start at a (seemingly) logical time. I've had this problem a number of times, but in the current case I want the breaks to fall on 01/01/yyyy. My data is a large dataset with a POSIXct Date column, the values to plot in a Flow column, and a number of site names in a Site column.

library(ggplot2)
library(scales)

ggplot(AllFlowData, aes(x = Date, y = Flow, colour = Site)) +
  geom_line() +
  scale_x_datetime(date_breaks = "1 year", expand = c(0, 0), labels = date_format("%Y"))

I can force the breaks to be every year, and without labels=date_format("%Y") they appear as expected (starting on 01/01 each year). But if I include labels=date_format("%Y") (there are 10 years of data, so the full dates get a bit messy), the date labels move to around November, and 1989 becomes the first label even though my data starts on 01/01/1990.

I have had this problem numerous times in the past at different time steps, such as wanting to force breaks to the 1st of the month, or daily breaks to midnight instead of during the day. Is there a generic way to do this?

I have looked at "create specific date range in ggplot2 (scale_x_date)", but I do not want to hard-code my breaks, as I have a fair few plots to make with different date ranges.

Thanks

J.Con
Sarah
  • You can set the breaks explicitly without hard-coding them by getting the date range directly from the data. For example, `library(lubridate); breaks = with(AllFlowData, seq(as.POSIXct(paste0(year(min(Date)), "-01-01")), max(Date), by = "year"))`. Or, just create a date sequence vector for `breaks` that spans the entire range of all of your data frames and use that same date sequence vector for all the plots. It won't affect the axis limits and will ensure that each plot has breaks on January 1st (or whenever) of each year. – eipi10 Jul 06 '17 at 05:45
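
A minimal base-R sketch of the idea in the comment above: derive yearly breaks from the data's own range instead of hard-coding them. The `dates` vector is a made-up stand-in for the asker's `AllFlowData$Date` column, and the UTC time zone is an assumption, not something stated in the question.

```r
# Hypothetical stand-in for the POSIXct Date column (UTC is an assumption).
dates <- seq(as.POSIXct("1990-01-01", tz = "UTC"),
             as.POSIXct("1999-12-31", tz = "UTC"),
             by = "day")

# Snap the earliest timestamp back to 1 January of its year.
start <- as.POSIXct(format(min(dates), "%Y-01-01"), tz = "UTC")

# One break per year, each at 1 January 00:00, up to the last observation.
year_breaks <- seq(start, max(dates), by = "year")
```

The resulting vector could then be supplied as `scale_x_datetime(breaks = year_breaks, labels = date_format("%Y", tz = "UTC"))`. Using the same time zone for the breaks and the labels is a precaution here, so that a label is not formatted in a different zone from the one the break was computed in.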

1 Answer


If the dates come to you in a vector like:

dates <- seq.Date(as.Date("2001-03-04"), as.Date("2001-11-04"), by="day")
## "2001-03-04" "2001-03-05" "2001-03-06" ... "2001-11-03" "2001-11-04"

use pretty(), which dispatches to pretty.Date() for Date vectors, to make a best guess about the end points.

range(pretty(dates))
## "2001-01-01" "2002-01-01"

Then pass this range to ggplot.

However, I recommend passing the range to coord_cartesian() rather than to the limits of scale_x_date(). Typically I want to crop the graphic bounds rather than exclude the values entirely, which can distort things like a loess summary.
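
As a rough sketch of how the pieces above might fit together, the rounded range from pretty() can be handed to coord_cartesian() through its xlim argument. The data frame `df` and its `Flow` values are invented purely for illustration.

```r
library(ggplot2)

# Illustrative data only: the answer's daily dates plus a made-up value column.
dates <- seq.Date(as.Date("2001-03-04"), as.Date("2001-11-04"), by = "day")
df <- data.frame(Date = dates, Flow = seq_along(dates))

# pretty() proposes rounded break dates; their range brackets the data.
x_range <- range(pretty(dates))

p <- ggplot(df, aes(x = Date, y = Flow)) +
  geom_line() +
  # Zoom the view to x_range; no observations are dropped from the data.
  coord_cartesian(xlim = x_range)
```

Because coord_cartesian() only changes the visible window, any smoother or summary layer added to `p` is still computed from all rows of `df`.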

wibeasley