I have the following code, which creates a time series:

ts.dat <- ts(data=dat$sales, start = 1, frequency = 12)

ts.dat returns

   Jan  Feb  Mar  Apr  May  Jun  Jul  Aug  Sep  Oct  Nov  Dec
1 9000 8600 8500 8600 8500 8300 8600 9100 8800 8700 9300 7900
2 7900 8800 8500 8900 9000 8800 8800 9100 9500 8900 9200 8400
3 8400 9200 9500 9100 8700 8300   NA

However,

 plot(stl(ts.dat, s.window=12))

returns

Error in na.fail.default(as.ts(x)) : missing values in object

I tried na.action=na.pass, but it didn't work. Any idea how to deal with the NA, if that is the reason?

Also: Any chance to take the first date from dat as the start?

Steffen Moritz
JohnnyDeer

2 Answers

3

Any idea how to deal with the NA, if that is the reason?

You need to use na.action = na.omit, i.e., drop the NA before the computation. Here the NA is the last value of the series, so na.omit simply trims it off the end.

plot(stl(ts.dat, s.window=12, na.action = na.omit))

[Figure: plot of the stl decomposition]

na.pass leaves the NA in place and treats it as an ordinary observation. That still produces an error, because stl() later calls compiled code that cannot handle NA.
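To make the effect of na.omit concrete, here is a minimal sketch that rebuilds the series from the question by hand (the vector name sales and the object with_gap are only illustrative). It assumes base R only. It shows that the trailing NA is trimmed, so the remaining observations keep their original time positions, and that an NA in the middle of a ts object is not silently dropped but makes na.omit stop with an error.

```r
# Values copied from the printed series in the question; the last one is NA.
sales <- c(9000, 8600, 8500, 8600, 8500, 8300, 8600, 9100, 8800, 8700, 9300, 7900,
           7900, 8800, 8500, 8900, 9000, 8800, 8800, 9100, 9500, 8900, 9200, 8400,
           8400, 9200, 9500, 9100, 8700, 8300, NA)
ts.dat <- ts(sales, start = 1, frequency = 12)

# na.omit on a ts object only trims leading/trailing NAs.
trimmed <- na.omit(ts.dat)
length(trimmed)   # one observation fewer than ts.dat
end(trimmed)      # the series now ends one month earlier

# The decomposition runs once the trailing NA is dropped.
fit <- stl(ts.dat, s.window = 12, na.action = na.omit)

# Illustrative only: an NA inside the series cannot be trimmed away,
# so na.omit signals an error instead of shifting later observations.
with_gap <- ts.dat
with_gap[5] <- NA
gap_result <- tryCatch(na.omit(with_gap),
                       error = function(e) conditionMessage(e))
```

If the series has gaps in the middle, na.omit is therefore not an option for stl(), and the missing values have to be filled in some other way first.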

Any chance to take the first date from dat as the start?

Have a look at the examples at the bottom of ?ts:

 ## Using July 1954 as start date:
 gnp <- ts(cumsum(1 + round(rnorm(100), 2)),
           start = c(1954, 7), frequency = 12)

To start from July 1954, put start = c(1954, 7).
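If dat already contains a date column, the start can be derived from its first entry instead of being typed in by hand. The sketch below is only an illustration: the column name month, its Date type, and the sample values are assumptions, since the question does not show the structure of dat.

```r
# Hypothetical stand-in for `dat`; the column name `month` and the
# values are assumptions made for this example.
dat <- data.frame(
  month = as.Date(c("2014-03-01", "2014-04-01", "2014-05-01", "2014-06-01")),
  sales = c(8500, 8600, 8500, 8300)
)

# Take year and month number from the first date in the data.
first <- dat$month[1]
start_ym <- c(as.integer(format(first, "%Y")),
              as.integer(format(first, "%m")))

# Use them as the start of the monthly series.
ts.dat <- ts(dat$sales, start = start_ym, frequency = 12)
start(ts.dat)
```

This assumes the rows are sorted by date and that there is exactly one row per month; ts() itself does not check the dates.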

Zheyuan Li
  • Perfect, thank you. I only wondered whether this could be automated, so that the start does not have to be adjusted by hand for every new dataset that already has a date column. I thought it might be possible to read that information from the first "month" row of dat. – JohnnyDeer Jun 29 '16 at 06:35
  • I think this solution has a problem: if you simply take out the NA values, your data does not have the same "dates". For example, if you have Day 1: 3, Day 2: NA, Day 3: 5, then na.omit will result in Day 1: 3, Day 2: 5. This can cause problems with seasonality and trends, in my opinion... – Javi_VM Nov 23 '18 at 09:28
1

You could also impute the missing data in your time series, i.e. replace the NA with a reasonable value.

There are R packages to do this (e.g. imputeTS or zoo).

imputeTS in particular has functions that are very good choices for replacing missing data in time series with seasonality, such as na_seadec() and na_kalman(). It also has other imputation functions; the package documentation gives an overview.

A solution here would look like this:

library(imputeTS)
x <- na_seadec(ts.dat)
plot(stl(x, s.window=12))
Steffen Moritz