I'm using ggplot2 to plot some time-series data along with a linear regression line. I want to determine when the regression line will reach 82%. Visual inspection of the graph suggests that this will happen around November 15, 2017. But when I use R's predict.lm() function, I get a different answer: August 12, 2017. Shouldn't these two methods give me the same answer? Ultimately, I'd like to annotate the graph with a text label that shows the intercept date.
library(ggplot2)
library(scales)  # provides percent, date_breaks and date_format used below
temp <- "End.Date Save.Rate
1 2015-05-31 0.67
2 2015-07-31 0.67
3 2015-09-30 0.69
4 2015-11-30 0.71
5 2016-01-30 0.70
6 2016-03-31 0.72"
df <- read.table(text = temp, header = TRUE)
# The time zone argument of as.POSIXct is tz; tzone is silently ignored
df$End.Date <- as.POSIXct(df$End.Date, origin="1970-01-01", tz="America/New_York")
save.rate.lm = lm(End.Date ~ Save.Rate, data=df)
newdata <- data.frame(Save.Rate = 0.82)
temp <- predict.lm(save.rate.lm, newdata)
# predict.lm returns seconds since the epoch here; convert back to a date-time
predicted.date <- as.POSIXct(as.numeric(temp), origin="1970-01-01",
                             tz="America/New_York")
print(predicted.date)
# Leave the lower x limit automatic (NA) and extend the upper limit to the end of 2017
x.lims <- c(as.POSIXct(NA), as.POSIXct("2017-12-31", origin="1970-01-01",
                                       tz="America/New_York"))
p <- ggplot(df, aes(x=End.Date, y=Save.Rate)) +
  geom_point() +
  stat_smooth(method='lm', fill=NA, fullrange=TRUE) +
  theme(axis.text.x=element_text(angle = -45, hjust = 0)) +
  scale_y_continuous(labels = percent) +
  scale_x_datetime(breaks = date_breaks('month'), labels = date_format('%b-%Y'),
                   limits=x.lims) +
  geom_hline(yintercept=0.82)
print(p)
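
A possible explanation, offered as a sketch rather than a definitive answer: stat_smooth(method='lm') fits Save.Rate as a function of End.Date, while the lm() call above fits End.Date as a function of Save.Rate. Least squares minimizes different residuals in each direction, so the two lines generally differ unless the points are perfectly collinear. The base-R snippet below refits the model in the plotted direction and solves the fitted line for the crossing time. The names rate.lm, t, target and cross.date are illustrative and not part of the original code.

```r
# Same six observations as in the question, built directly as a data frame
df <- data.frame(
  End.Date = as.POSIXct(c("2015-05-31", "2015-07-31", "2015-09-30",
                          "2015-11-30", "2016-01-30", "2016-03-31"),
                        tz = "America/New_York"),
  Save.Rate = c(0.67, 0.67, 0.69, 0.71, 0.70, 0.72)
)

# Regress rate on time (seconds since the epoch), the direction the plot uses
df$t <- as.numeric(df$End.Date)
rate.lm <- lm(Save.Rate ~ t, data = df)

# Invert the fitted line: target = a + b * t  =>  t = (target - a) / b
target <- 0.82
coefs <- unname(coef(rate.lm))
t.cross <- (target - coefs[1]) / coefs[2]

# Convert the crossing time back to a date-time
cross.date <- as.POSIXct(t.cross, origin = "1970-01-01", tz = "America/New_York")
print(cross.date)
```

If this reasoning holds, cross.date could then be used for the label, for example by adding annotate("text", x = cross.date, y = 0.82, label = format(cross.date, "%b %d, %Y")) to the plot.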