This is my data frame (`df`):

         ds         y
1 2020-09-25 42,034.88
2 2020-09-24 41,806.37
3 2020-09-23 41,876.26
4 2020-09-22 41,828.91
5 2020-09-21 42,174.13

`dput` output for the first 80 rows of this data:

    structure(list(ds = structure(c(14613, 14614, 14615, 14616, 14617, 
14620, 14621, 14622, 14623, 14624, 14627, 14628, 14629, 14630, 
14631, 14634, 14635, 14636, 14637, 14638, 14641, 14642, 14643, 
14644, 14648, 14649, 14650, 14651, 14652, 14655, 14656, 14657, 
14658, 14659, 14662, 14663, 14664, 14665, 14666, 14669, 14670, 
14671, 14672, 14673, 14676, 14677, 14678, 14679, 14680, 14683, 
14684, 14685, 14686, 14687, 14690, 14692, 14693, 14694, 14697, 
14698, 14699, 14700, 14701, 14704, 14705, 14706, 14707, 14708, 
14711, 14712, 14713, 14714, 14715, 14718, 14719, 14720, 14721, 
14722, 14725, 14726), class = "Date"), y = c(9437.85, 9657.38, 
9727.36, 9737.47, 9776.21, 9797, 9778.36, 9784.85, 9802.45, 9923.14, 
9895.46, 9954.41, 9904.74, 9753.84, 9774.07, 9689.2, 9666.48, 
9603.55, 9579.81, 9614.19, 9591.5, 9595.24, 9627.63, 9769.73, 
9809.98, 9786.46, 9733.36, 9802.8, 9805.87, 9701.81, 9769.68, 
9867.09, 9889.3, 9902.62, 9953.07, 9823.57, 9686.18, 9667.17, 
9657.79, 9498.56, 9546.38, 9419.43, 9511.53, 9626.29, 9740.19, 
9787.03, 9784.98, 9879.7, 10025.99, 10088.45, 10017.71, 9981.48, 
10007.87, 10000.93, 9963.35, 10146.27, 10127.03, 10137.93, 10056.46, 
10073.76, 10178.43, 10246.77, 10416.51, 10447.84, 10419.82, 10523.01, 
10533.57, 10586.46, 10557.19, 10506.2, 10570.88, 10677.47, 10659.21, 
10669.88, 10641.52, 10590.21, 10615.15, 10607.03, 10556.37, 10556.38
)), row.names = 2663:2584, class = "data.frame")

Code tried for Prophet forecast:

  try <- prophet(df)
  future <- make_future_dataframe(try, periods = 31, freq = "day")
  forecast <- predict(try,future)

Error in Prophet

    Error in if (m$y.scale == 0) { : missing value where TRUE/FALSE needed
    In addition: Warning message:
    In setup_dataframe(m, history, initialize_scales = TRUE) :

Code tried for ARIMA (the data start on 4 January 2010; weekends are missing, and a day or two may be missing in between because of public holidays or days with no transactions):

  x = ts(df$y, start = c(2010,1,4), frequency = 365)
  arima1 = auto.arima(x)
  forecast1 = forecast(arima1,h = 31)

Error in Auto Arima:

    Error in stats::arima(x = x, order = order, seasonal = seasonal, include.mean = include.mean, : 'x' must be numeric
    In addition: Warning message:
    In is.constant(x) : NAs introduced by coercion

Any advice?

A.B.
  • Can you add the result of `dput(df)`? I need to know the class of each column. PS: `try` is a really unfortunate choice for a variable name – Edo Sep 25 '20 at 12:21
  • `y` is not numeric. That's why you have troubles. Try this: `df$y <- as.numeric(gsub(",", "", df$y))` – Edo Sep 25 '20 at 12:30
  • PS: the output of dput is really weird. Did you edit it manually? The data.frame has 2663 rows, but its columns are respectively 80 and 42 units long. It shouldn't be like this. – Edo Sep 25 '20 at 12:34
  • The number format fix worked, but the `ts` structure shows 2010 to 2017 although the time series runs up to today (25-09-2020): `Time-Series [1:2663] from 2010 to 2017: 9438 9657 9727` – A.B. Sep 25 '20 at 14:29
  • Because that's not how you should define `start` in a `ts`. Try: `ts(df$y, start = eval(parse(text = format(min(df$ds), "c(%Y, %j)"))), frequency = 365.25)` – Edo Sep 25 '20 at 14:34
  • Watch out, because in the data you provided time is inverted. Sort it first. – Edo Sep 25 '20 at 14:35
  • Yes, I reversed it with `df <- df[rev(1:nrow(df)),]` and tried your suggestion for `ts`; still the same message, time series from 2010 to 2017 – A.B. Sep 25 '20 at 14:36
  • Your dput is wrong, so I can't help you. – Edo Sep 25 '20 at 14:37
  • The dput is fine, I just didn't put the whole thing here. It's public data; let me know your email and I will send it to you – A.B. Sep 25 '20 at 14:37
  • You printed it the wrong way. You should have done: `dput(head(df, 80))` – Edo Sep 25 '20 at 14:38
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/222081/discussion-between-a-b-and-edo). – A.B. Sep 25 '20 at 14:39
  • My suggestion works with your new data. I don't understand what problem you have now. – Edo Sep 25 '20 at 14:42
  • Did you try it? Can you share whether your `ts` shows 2010 to 2020 instead of 2010 to 2017? – A.B. Sep 25 '20 at 14:48
  • If the first date is in fact "2010-01-04" and your data are 2663 days long, the last date is "2017-04-19". You can't get to 2020 unless you have missing data or your series is not daily. – Edo Sep 25 '20 at 14:53
  • Yes, that's what I said in the question: my data have missing dates due to weekends, public holidays and no-transaction days – A.B. Sep 25 '20 at 15:06
  • It wouldn't be right to fit an ARIMA model on a daily basis because you have NAs; try making it weekly or monthly. – cdcarrion Sep 25 '20 at 15:46
  • How do I make it weekly? – A.B. Sep 25 '20 at 17:52
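
The suggestions in the comments above (convert `y` to numeric, sort by date, aggregate to weeks) could be sketched roughly as follows. This is only a minimal illustration in base R, not a tested solution for the full data set: it uses just the five rows printed in the question, the `week` column and the `weekly` data frame are names introduced here for illustration, and grouping by Monday-to-Sunday weeks and averaging within each week are assumptions (a last-value or sum aggregation may suit the data better).

```r
# Stand-in for the real data: the five rows printed in the question,
# with y stored as text containing thousands separators.
df <- data.frame(
  ds = as.Date(c("2020-09-25", "2020-09-24", "2020-09-23",
                 "2020-09-22", "2020-09-21")),
  y  = c("42,034.88", "41,806.37", "41,876.26", "41,828.91", "42,174.13"),
  stringsAsFactors = FALSE
)

# 1. Strip the commas and convert y to numeric (as suggested in the comments).
df$y <- as.numeric(gsub(",", "", df$y))

# 2. Put the rows in chronological order (the data arrive newest first).
df <- df[order(df$ds), ]

# 3. Label each row with the Monday that starts its week.
#    "%u" gives the weekday as 1 (Monday) to 7 (Sunday).
#    Assumption: Monday-to-Sunday weeks are an acceptable grouping.
df$week <- df$ds - (as.integer(format(df$ds, "%u")) - 1)

# 4. Average y within each week (assumption: the mean is the wanted summary).
weekly <- aggregate(y ~ week, data = df, FUN = mean)

print(weekly)
```

With weekly values, missing weekdays and holidays only reduce the number of observations behind each weekly figure instead of leaving gaps in the series. The result could then presumably be passed to `ts()` with a frequency of about 52 before calling `auto.arima()`, although that has not been checked against the full data here.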

0 Answers