
Consider a random data.frame:

d <- data.frame(replicate(10,sample(0:1,1000,rep=TRUE)))

I want to treat each row as a separate time series (in this case covering ten years). So first, I need to convert the data to time series. I have tried the following code:

d1 <- ts(d, start=2000, end=2009)

However, I think this code treats the data as one long time series spanning 100 years. In my case I want 1,000 separate time series of 10 years each.
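One way to check what `ts()` actually built is to inspect the result directly. The sketch below uses only base R and the same `d` as above (the `set.seed(1)` call is an addition, just to make it reproducible). It suggests that the call keeps the 10 columns as 10 parallel series and retains only the first 10 rows, rather than building one long series.

```r
set.seed(1)  # added for reproducibility; not in the original question
d <- data.frame(replicate(10, sample(0:1, 1000, replace = TRUE)))

d1 <- ts(d, start = 2000, end = 2009)

# A multivariate time series: rows are years, columns are series
inherits(d1, "mts")
dim(d1)    # 10 10
time(d1)   # 2000, 2001, ..., 2009

# The rows of d1 are just the first 10 rows of d; the other 990 are dropped
all(d1 == as.matrix(d[1:10, ]))
```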

Then I want to forecast each of the 1,000 time series (say, 1 year ahead), using the following code:

fit <- tslm(d1 ~ trend)
fcast <- forecast(fit, h = 1)
plot(fcast)

I get only one forecast (since my dataset, d1, is treated as a single time series).

Can anyone help me with this?

Michael

2 Answers


If the goal is to create a time series for each column, loop through the columns of the dataset with lapply and create one per column:

library(forecast)
lst1 <- lapply(d, ts, start = 2000, end = 2009)
#If we want to split by `row`
#lst1 <- lapply(asplit(as.matrix(d), 1), ts, start = 2000, end = 2009)
par(mfrow = c(5, 2))
lapply(lst1, function(x) {
        fit <- tslm(x ~ trend)
        fcast <- forecast(fit, h = 1)
        plot(fcast)
   })
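If the plot window is too small for a 5 x 2 grid, or there are many series to look at, one option is to write the plots to a multi-page PDF instead of the screen. The following is a minimal sketch in base R only: it plots the raw series for the first 8 rows (not the forecasts), and the file name `series_plots.pdf` and the 2 x 2 layout are arbitrary choices for illustration.

```r
set.seed(1)
d <- data.frame(replicate(10, sample(0:1, 1000, replace = TRUE)))

# Hypothetical output file in the session's temporary directory
out_file <- file.path(tempdir(), "series_plots.pdf")

pdf(out_file, width = 8, height = 8)
par(mfrow = c(2, 2))          # 4 plots per page; a new page starts automatically
for (i in 1:8) {              # first 8 rows only, to keep the example short
  x <- ts(unlist(d[i, ]), start = 2000)   # one row = one series, 2000 to 2009
  plot(x, main = paste("Series", i), ylab = "value")
}
invisible(dev.off())          # close the device so the file is written
```

The same pattern should work with `plot(fcast)` inside the loop once the forecast package is loaded.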


akrun
  • Great. Thanks! However, I get the error "Error in plot.new() : figure margins too large" when I run it. I create my data set d, and then run your code. Am I doing anything wrong? – Michael Oct 10 '19 at 17:58
  • @Michael Are you probably running it in `RStudio`? Just increase the size of the plot window – akrun Oct 10 '19 at 17:59
  • A newbie question: But how do I do that? – Michael Oct 10 '19 at 18:04
  • @Michael I would stretch the plot window with the cursor. Also, on a smaller window, you can change `par(mfrow = c(2, 2))` i.e. 2 plots per each row, 4 plots in total – akrun Oct 10 '19 at 18:05
  • Of course. I stretched it in the wrong direction. Now it works. Thanks a lot! :-) – Michael Oct 10 '19 at 18:06
  • However, does this code only do it for 10 time series? I want to do it for each of the 1,000 rows in the dataset d. I know it is probably impossible to show them all in one graph, but can I then just create a new dataset with the forecasted value for each of the 1,000 time series? I hope my question makes sense. – Michael Oct 10 '19 at 18:09
  • @Michael Here, we are looping over each column. So, it doesn't matter whether the number of rows is 10 or 1000. If the number of columns is 1000, it may be better to have multiple graphs or a single pdf with multiple pages – akrun Oct 10 '19 at 18:11
  • Ah, now I see. I don't want to consider each column as a time-series. I want to consider each row as a time-series. I think I wrote that in the description - sorry if I was not clear enough. – Michael Oct 10 '19 at 18:13
  • @Michael In that case, just change the function `lst1 <- lapply(asplit(as.matrix(d), 1), ts, start = 2000, end = 2009)` – akrun Oct 10 '19 at 18:15
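For the question in the comments about collecting the forecasted value for each of the 1,000 rows into a new dataset, here is a minimal sketch. It is a base-R stand-in rather than the `tslm()`/`forecast()` code above: it fits an ordinary `lm()` linear trend to each row and predicts the next year, which should give the same point forecasts as a linear trend model. The helper name `forecast_row` and the output columns are made up for illustration.

```r
set.seed(1)
d <- data.frame(replicate(10, sample(0:1, 1000, replace = TRUE)))

years <- 2000:2009

# Hypothetical helper: fit a linear trend to one row, predict the next year
forecast_row <- function(y) {
  fit <- lm(y ~ t, data = data.frame(y = y, t = years))
  unname(predict(fit, newdata = data.frame(t = max(years) + 1)))
}

# One row per original series, with its one-step-ahead point forecast
fc <- data.frame(
  series   = seq_len(nrow(d)),
  year     = max(years) + 1,
  forecast = apply(as.matrix(d), 1, forecast_row)
)
head(fc)
```

Note that with 0/1 data a linear trend can produce forecasts outside the range 0 to 1.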

@akrun shows how to do it using base R and the forecast package.

Here's how to do the same thing using the new fable package, which is designed to handle this sort of thing.

library(tidyverse)
library(tsibble)
library(fable)

set.seed(1)
d <- data.frame(replicate(10, sample(0:1, 1000, rep = TRUE)))
# Transpose
d <- t(d)
colnames(d) <- paste("Series",seq(NCOL(d)))
# Convert to a tsibble
df <- d %>%
  as_tibble() %>%
  mutate(time = 1:10) %>%
  gather(key = "Series", value = "value", -time) %>%
  as_tsibble(index = time, key = Series)
df
#> # A tsibble: 10,000 x 3 [1]
#> # Key:       Series [1,000]
#>     time Series   value
#>    <int> <chr>    <int>
#>  1     1 Series 1     0
#>  2     2 Series 1     1
#>  3     3 Series 1     0
#>  4     4 Series 1     0
#>  5     5 Series 1     1
#>  6     6 Series 1     0
#>  7     7 Series 1     0
#>  8     8 Series 1     0
#>  9     9 Series 1     1
#> 10    10 Series 1     0
#> # … with 9,990 more rows
# Fit models
fit <- model(df, TSLM(value ~ trend()))
# Compute forecasts
fcast <- forecast(fit, h = 1)
# Plot forecasts for one series
fcast %>%
  filter(Series == "Series 1") %>%
  autoplot(df)

Created on 2019-10-11 by the reprex package (v0.3.0)

Rob Hyndman