Reproducible example:
df=structure(list(group = c(1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L,
2L, 2L, 2L), year = c(1973L, 1974L, 1975L, 1976L, 1977L, 1978L,
1973L, 1974L, 1975L, 1976L, 1977L, 1978L), Jan = c(9007L, 7750L,
8162L, 7717L, 7792L, 7836L, 9007L, 7750L, 8162L, 7717L, 7792L,
7836L), Feb = c(8106L, 6981L, 7306L, 7461L, 6957L, 6892L, 8106L,
6981L, 7306L, 7461L, 6957L, 6892L), Mar = c(8928L, 8038L, 8124L,
7767L, 7726L, 7791L, 8928L, 8038L, 8124L, 7767L, 7726L, 7791L
), Apr = c(9137L, 8422L, 7870L, 7925L, 8106L, 8192L, 9137L, 8422L,
7870L, 7925L, 8106L, 8192L), May = c(10017L, 8714L, 9387L, 8623L,
8890L, 9115L, 10017L, 8714L, 9387L, 8623L, 8890L, 9115L), Jun = c(10826L,
9512L, 9556L, 8945L, 9299L, 9434L, 10826L, 9512L, 9556L, 8945L,
9299L, 9434L), Jul = c(11317L, 10120L, 10093L, 10078L, 10625L,
10484L, 11317L, 10120L, 10093L, 10078L, 10625L, 10484L), Aug = c(10744L,
9823L, 9620L, 9179L, 9302L, 9827L, 10744L, 9823L, 9620L, 9179L,
9302L, 9827L), Sep = c(9713L, 8743L, 8285L, 8037L, 8314L, 9110L,
9713L, 8743L, 8285L, 8037L, 8314L, 9110L), Oct = c(9938L, 9129L,
8466L, 8488L, 8850L, 9070L, 9938L, 9129L, 8466L, 8488L, 8850L,
9070L), Nov = c(9161L, 8710L, 8160L, 7874L, 8265L, 8633L, 9161L,
8710L, 8160L, 7874L, 8265L, 8633L), Dec = c(8927L, 8680L, 8034L,
8647L, 8796L, 9240L, 8927L, 8680L, 8034L, 8647L, 8796L, 9240L
)), .Names = c("group", "year", "Jan", "Feb", "Mar", "Apr", "May",
"Jun", "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"), class = "data.frame", row.names = c(NA,
-12L))
Perform the forecast by group:
library(forecast)
ld <- split(df[, -1], df$group)
ld <- lapply(ld, function(x) {ts(c(t(x[,-1])), start = min(x[,1]), frequency = 12)})
lts <- lapply(ld, ets, model = "ZZZ")
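The table shown next looks like the printed output of forecast(), but that call is not in the code above. Here is a minimal sketch of the missing step. It assumes a horizon of h = 12 months (the horizon actually used is not stated) and uses the built-in USAccDeaths series as a stand-in for each group, since it appears to hold the same monthly values as df.

```r
library(forecast)

# Stand-in for the list built above: two groups with identical monthly series.
ld <- list(`1` = USAccDeaths, `2` = USAccDeaths)

# Fit one ETS model per group, letting ets() choose the components.
lts <- lapply(ld, ets, model = "ZZZ")

# Forecast each fitted model; h = 12 is an assumed horizon.
fc <- lapply(lts, forecast, h = 12)

# Print the forecast table for group 1.
print(fc[["1"]])
```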
The result:
$`1`
Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
Jan 1979 8397.497 8022.399 8772.595 7823.834 8971.160
Feb 1979 7599.221 7162.825 8035.616 6931.812 8266.630
Mar 1979 8396.595 7906.510 8886.679 7647.075 9146.115
Apr 1979 8646.510 8108.063 9184.957 7823.026 9469.994
The values from 1979 onward are forecasts. I want to get the residuals for the initial values, 1973-1978.
res <- lapply(lts, residuals)
The result:
$`1`
Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov
1973 497.69233 99.50607 64.44947 -15.20925 77.85009 390.89045 -277.67369 26.92614 72.42590 -85.69894 -338.10035
and so on
Question
How can I join the residuals into the summary table? For example, a table where for 1979 and later we see the forecasted values, but for 1973-1978 the Point Forecast column holds the residuals.
Ideally, of course, I would get not just the residuals but also the original values and the forecasted values.
I don't know how to join the original values for the initial data (1973-1978) into the summary table. For a single year I can do something like this
df[df$year == 1973,]
but how do I do it for all years? Then I would subtract the residual from the original value to get the forecasted value. (Maybe I am overcomplicating the task, but otherwise I don't know how to get the desired output.)
The column names Point Forecast, Lo 80 and Hi 80 do not need to change; I will remember that for the initial values they mean residual, original and forecasted.
Is it possible to do this with a dplyr or data.table solution?
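One way to sketch the desired table is to stack the in-sample rows (original, fitted, residual) on top of the forecast rows. This is a sketch rather than a definitive answer: summarise_fit and ts_dates are hypothetical helper names, h = 12 and the 80% level are assumed, and the built-in USAccDeaths series stands in for each group. The residual is computed as original minus fitted, because residuals() on an ets fit may return relative errors when a multiplicative error model is selected.

```r
library(forecast)

# Convert the time index of a monthly ts into first-of-month Dates.
ts_dates <- function(x) {
  yr <- as.integer(floor(time(x) + 1e-6))  # small offset guards against rounding
  as.Date(sprintf("%d-%02d-01", yr, as.integer(cycle(x))))
}

# Hypothetical helper: history rows followed by forecast rows for one series.
summarise_fit <- function(y, h = 12) {
  fit <- ets(y, model = "ZZZ")
  fc  <- forecast(fit, h = h, level = 80)
  history <- data.frame(
    date     = ts_dates(y),
    type     = "history",
    original = as.numeric(y),
    fitted   = as.numeric(fitted(fit)),
    lo80     = NA_real_,
    hi80     = NA_real_
  )
  future <- data.frame(
    date     = ts_dates(fc$mean),
    type     = "forecast",
    original = NA_real_,
    fitted   = as.numeric(fc$mean),   # point forecast
    lo80     = as.numeric(fc$lower),
    hi80     = as.numeric(fc$upper)
  )
  out <- rbind(history, future)
  out$residual <- out$original - out$fitted  # NA on forecast rows
  out
}

# Stand-in for the per-group list of ts objects.
ld <- list(`1` = USAccDeaths, `2` = USAccDeaths)

# One table per group, tagged with the group name, then stacked.
tabs <- Map(function(y, g) cbind(group = g, summarise_fit(y)), ld, names(ld))
out <- do.call(rbind, tabs)
rownames(out) <- NULL
head(out)
```

With dplyr, the last step could instead be dplyr::bind_rows(lapply(ld, summarise_fit), .id = "group"), which adds the group column from the list names.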
library(dplyr)
library(tidyr)
# Tidy-up the splits
ld <- lapply(ld, function(x) {
  x %>%
    gather(key, value, -year) %>%
    unite(date, year, key, sep = "-") %>%
    mutate(date = paste0(date, "-01")) %>%
    mutate(date = as.Date(date, format = "%Y-%b-%d"))
})
The result:
$`1`
date value
1 <NA> 9007
2 <NA> 7750
3 <NA> 8162
4 <NA> 7717
5 <NA> 7792
6 <NA> 7836
7 <NA> 8106
8 <NA> 6981
9 <NA> 7306
10 <NA> 7461
11 <NA> 6957
12 <NA> 6892
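One possible cause of the NA dates (an assumption, not something the output above confirms) is that %b is matched against the month abbreviations of the current locale, so "Jan" fails to parse under a non-English LC_TIME setting. The sketch below sidesteps this by mapping the column names to month numbers with the base constant month.abb. It also sorts the rows chronologically, because the gathered rows come out month by month rather than year by year. The small data frame x is made up from the first two years and months of df.

```r
# Made-up slice of one group: two years, two months.
x <- data.frame(year = c(1973L, 1974L),
                Jan  = c(9007L, 7750L),
                Feb  = c(8106L, 6981L))

months <- setdiff(names(x), "year")

# Wide to long in base R: one row per (year, month) pair.
long <- data.frame(
  year  = rep(x$year, times = length(months)),
  key   = rep(months, each = nrow(x)),
  value = unlist(x[months], use.names = FALSE)
)

# Build the date from numeric parts, so no locale-dependent parsing is needed.
long$date <- as.Date(sprintf("%d-%02d-01", long$year, match(long$key, month.abb)))

# Put the rows in time order before turning them into a ts.
long <- long[order(long$date), c("date", "value")]
long
```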
ld=dput()
ld <- lapply(ld, function(x) {
yr <- lubridate::year(min(x$date))
mth <- lubridate::month(min(x$date))
timetk::tk_ts(data = x, select = value, frequency = 12,
start = c(yr, mth))
})
This gives the error:
Error in x$date : $ operator is invalid for atomic vectors
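This error means that x is not a data frame inside the function. A plausible explanation, assumed here rather than confirmed, is that ld already holds ts objects (for example from the earlier lapply call), and a ts is an atomic vector. Below is a minimal sketch of a guard. It uses base ts() in place of timetk::tk_ts and a made-up two-row data frame, and to_monthly_ts is a hypothetical helper name.

```r
# Hypothetical helper: accept either a ts or a (date, value) data frame.
to_monthly_ts <- function(x) {
  if (is.ts(x)) return(x)  # already converted, nothing to do
  stopifnot(is.data.frame(x), all(c("date", "value") %in% names(x)))
  x <- x[order(x$date), ]
  first <- min(x$date)
  ts(x$value,
     start = c(as.integer(format(first, "%Y")), as.integer(format(first, "%m"))),
     frequency = 12)
}

x_df <- data.frame(date  = as.Date(c("1973-01-01", "1973-02-01")),
                   value = c(9007, 8106))

x_ts <- to_monthly_ts(x_df)   # data frame in, ts out
is.atomic(x_ts)               # a ts is atomic, so x_ts$date would fail
same <- to_monthly_ts(x_ts)   # a second call is harmless
```

Running sapply(ld, class) before the lapply call would show whether the list elements are still data frames.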
Edit 3:
> lts_all <- lapply(names(lts), function(x, lts) {
+ output_fit <- lts[[x]][["res_fit_tbl"]] %>%
+ mutate(group = x)
+ output_fcst <- lts[[x]][["res_fcst_tbl"]] %>%
+ mutate(group = x)
+
+ return(list(output_fit=output_fit, output_fcst=output_fcst))
+ }, lts)
> lts_all
[[1]]
[[1]]$output_fit
# A tibble: 72 x 6
date value residuals CI95_upper CI95_lower group
<date> <dbl> <dbl> <dbl> <dbl> <chr>
1 1973-01-01 8509 498 9083 7936 value
2 1973-02-01 8006 99.5 8580 7433 value
3 1973-03-01 8864 64.4 9437 8290 value
4 1973-04-01 9152 - 15.2 9726 8579 value
5 1973-05-01 9939 77.9 10513 9365 value
6 1973-06-01 10435 391 11009 9861 value
7 1973-07-01 11595 -278 12168 11021 value
8 1973-08-01 10717 26.9 11291 10143 value
9 1973-09-01 9641 72.4 10214 9067 value
10 1973-10-01 10024 - 85.7 10597 9450 value
# ... with 62 more rows