0

I was looking for a good way to do time series projection in BigQuery, and found this one, which nicely works to calculate correlations and slope: View Post. But it doesn't help to extend the timeline to your choice.

But can anyone please suggest a complete solution, where I can extend the timeline (x) according to my need and get projections of (Y) using a single query?

Any help will be highly appreciated.

1 Answers1

5

The basic idea is to left join the model specs to a generated dates table and use it:

WITH stats AS (
  SELECT * FROM UNNEST([
    STRUCT( 'a' AS model, 0.3 AS slope, 11 AS intercept ),
    STRUCT( 'b', 0.2, 7)
  ])
)

SELECT
  date,
  model,
  slope,
  intercept,
  UNIX_DATE(date) AS X,
  slope * UNIX_DATE(date) + intercept AS Y
FROM
  UNNEST(GENERATE_DATE_ARRAY(DATE('2018-05-01'),DATE('2018-07-01'))) AS date
  LEFT JOIN stats ON TRUE
ORDER BY date ASC

I did not repeat the statistics part since it is already answered, but I created a dummy table with two models which replaces it, The model can also be a bucket of course, then you'd have to left join on that as a key.

I'm also assuming you created the model with dates using unix date (days since 1970-01-01), if not you need to modify accordingly.

Martin Weitzmann
  • 4,430
  • 10
  • 19
  • Hi @Martin Weitzmann, Sorry for the delayed response; but can't thank you enough for your wonderful answer. The cross join does the trick. Two questions: (a) You wrote "add your calculations here"--what calculations do you mean? (b) I want to extend the first part of the query to multiple regressions; can you help me modify the statistical part? Now everybody seems to believe BigQuery ML is answer to all forecasting problems. But that too uses LR to create training models most cases. I may be wrong, but the BQML technique seems to be an overkill for simple forecasting works. – AliveToLearn Aug 11 '18 at 02:46
  • Happy to help, you could consider accepting the answer (click on the check mark beside the answer to toggle it from greyed out to filled in). ad (a): was a leftover, I removed it (b) would be best to ask a new question to the community, it would somewhat bloat the answer here and not really fit the question – Martin Weitzmann Aug 11 '18 at 08:29
  • Posted a new question. Here :[https://stackoverflow.com/questions/51798133/forecasting-using-multiple-regression-in-bigquery] – AliveToLearn Aug 11 '18 at 09:04