0

I have a table with dates, scores and an id which in this case is not unique.

Here is a sample data set and SQLFiddle of same:

create TABLE variable_test (
      date DATE
    , id INT
    , score INT
);

insert into variable_test
(date,id, score)
VALUES
('2018-01-01',123456,465),
('2018-01-02',123456,165),
('2018-01-03',123456,14),
('2018-01-04',123456,420),
('2018-01-05',123456,4152),
('2018-01-06',123456,1),
('2018-01-07',123456,5923),
('2018-01-08',123456,42),
('2018-01-09',123456,2154),
('2018-01-10',123456,1542),
('2018-01-11',123456,15183),
('2018-01-12',123456,15603),
('2018-01-13',123456,152),
('2018-01-14',123456,210),
('2018-01-15',123456,25),
('2018-01-16',123456,232),
('2018-01-17',123456,1),
('2018-01-18',123456,75),
('2018-01-19',123456,36),
('2018-01-20',123456,45),
('2018-01-21',123456,75212),
('2018-01-22',123456,111),
('2018-01-23',123456,2),
('2018-01-24',123456,2),
('2018-01-25',123456,65),
('2018-01-26',123456,511),
('2018-01-27',123456,210),
('2018-01-28',123456,25),
('2018-01-29',123456,232),
('2018-01-30',123456,1),
('2018-01-31',123456,75),
('2018-02-01',123456,36),
('2018-02-02',123456,45),
('2018-02-03',123456,75212),
('2018-02-04',123456,111),
('2018-02-05',123456,2),
('2018-02-06',123456,2),
('2018-02-07',123456,65),
('2018-02-08',123456,511),
('2018-02-09',123456,210),
('2018-02-10',123456,25),
('2018-02-11',123456,232),
('2018-02-12',123456,1),
('2018-02-13',123456,75),
('2018-02-14',123456,36),
('2018-02-15',123456,45),
('2018-02-16',123456,75212),
('2018-02-17',123456,111),
('2018-02-18',123456,2),
('2018-02-19',123456,2),
('2018-02-20',123456,65),
('2018-02-21',123456,511),
('2018-02-22',123456,210),
('2018-02-23',123456,25),
('2018-02-24',123456,232),
('2018-02-25',123456,1),
('2018-02-26',123456,75),
('2018-02-27',123456,36),
('2018-02-28',123456,45),
('2018-01-01',78910,75212),
('2018-01-02',78910,111),
('2018-01-03',78910,2),
('2018-01-04',78910,2),
('2018-01-05',78910,65),
('2018-01-06',78910,511),
('2018-01-07',78910,210),
('2018-01-08',78910,25),
('2018-01-09',78910,232),
('2018-01-10',78910,1),
('2018-01-11',78910,75),
('2018-01-12',78910,36),
('2018-01-13',78910,45),
('2018-01-14',78910,75212),
('2018-01-15',78910,111),
('2018-01-16',78910,2),
('2018-01-17',78910,2),
('2018-01-18',78910,65),
('2018-01-19',78910,511),
('2018-01-20',78910,210),
('2018-01-21',78910,25),
('2018-01-22',78910,232),
('2018-01-23',78910,1),
('2018-01-24',78910,75),
('2018-01-25',78910,36),
('2018-01-26',78910,45),
('2018-01-27',78910,75212),
('2018-01-28',78910,111),
('2018-01-29',78910,2),
('2018-01-30',78910,2),
('2018-01-31',78910,65),
('2018-02-01',78910,511),
('2018-02-02',78910,210),
('2018-02-03',78910,25),
('2018-02-04',78910,232),
('2018-02-05',78910,1),
('2018-02-06',78910,75),
('2018-02-07',78910,36),
('2018-02-08',78910,45),
('2018-02-09',78910,75212),
('2018-02-10',78910,111),
('2018-02-11',78910,2),
('2018-02-12',78910,2),
('2018-02-13',78910,65),
('2018-02-14',78910,511),
('2018-02-15',78910,210),
('2018-02-16',78910,25),
('2018-02-17',78910,232),
('2018-02-18',78910,1),
('2018-02-19',78910,75),
('2018-02-20',78910,36),
('2018-02-21',78910,45),
('2018-02-22',78910,75212),
('2018-02-23',78910,111),
('2018-02-24',78910,2),
('2018-02-25',78910,2),
('2018-02-26',78910,65),
('2018-02-27',78910,511),
('2018-02-28',78910,210),
('2018-01-01',11121314,25),
('2018-01-02',11121314,232),
('2018-01-03',11121314,1),
('2018-01-04',11121314,75),
('2018-01-05',11121314,36),
('2018-01-06',11121314,45),
('2018-01-07',11121314,75212),
('2018-01-08',11121314,111),
('2018-01-09',11121314,2),
('2018-01-10',11121314,2),
('2018-01-11',11121314,65),
('2018-01-12',11121314,511),
('2018-01-13',11121314,210),
('2018-01-14',11121314,25),
('2018-01-15',11121314,232),
('2018-01-16',11121314,1),
('2018-01-17',11121314,75),
('2018-01-18',11121314,36),
('2018-01-19',11121314,45),
('2018-01-20',11121314,75212),
('2018-01-21',11121314,111),
('2018-01-22',11121314,2),
('2018-01-23',11121314,2),
('2018-01-24',11121314,65),
('2018-01-25',11121314,511),
('2018-01-26',11121314,210),
('2018-01-27',11121314,25),
('2018-01-28',11121314,232),
('2018-01-29',11121314,1),
('2018-01-30',11121314,75),
('2018-01-31',11121314,36),
('2018-02-01',11121314,45),
('2018-02-02',11121314,75212),
('2018-02-03',11121314,111),
('2018-02-04',11121314,2),
('2018-02-05',11121314,2),
('2018-02-06',11121314,65),
('2018-02-07',11121314,511),
('2018-02-08',11121314,210),
('2018-02-09',11121314,25),
('2018-02-10',11121314,232),
('2018-02-11',11121314,1),
('2018-02-12',11121314,75),
('2018-02-13',11121314,36),
('2018-02-14',11121314,45),
('2018-02-15',11121314,75212),
('2018-02-16',11121314,111),
('2018-02-17',11121314,2),
('2018-02-18',11121314,2),
('2018-02-19',11121314,65),
('2018-02-20',11121314,511),
('2018-02-21',11121314,210),
('2018-02-22',11121314,25),
('2018-02-23',11121314,232),
('2018-02-24',11121314,1),
('2018-02-25',11121314,75),
('2018-02-26',11121314,36),
('2018-02-27',11121314,45),
('2018-02-28',11121314,75212);

http://sqlfiddle.com/#!9/cd3a81/4 .

I am trying to iterate through the table for each id and arrive at a 3 day moving average of the scores.

I have the following code which seems to achieve this but I don't fully understand how it's working. It's ordered by id and date and resets whenever the id changes.

What I am trying to understand is how these variable assignments are working.

Maybe there is a much better way to do this. Thanks.

 select date, id, AVG(three_day_total)
from (
SELECT
    date,
    id,
    score,
    CASE @i
    WHEN id
        THEN @i := id
    ELSE (@i := id)
             AND (@n := 0)
             AND (@a0 := 0) AND (@a1 := 0) AND (@a2 := 0)
    END               AS a,
    @a0 := @a1,
    @a1 := @a2,
    @a2 := score,
    (@a0 + @a1 + @a2) AS three_day_total
FROM qa_db.variable_test
ORDER BY id, date
 ) x
group by x.date, x.id

Running the inner section of the query shows more detail of the case statement:

SELECT
    date,
    id,
    score,
    CASE @i
    WHEN id
        THEN @i := id
    ELSE (@i := id)
             AND (@n := 0)
             AND (@a0 := 0) AND (@a1 := 0) AND (@a2 := 0)
    END               AS a,
    @a0 := @a1,
    @a1 := @a2,
    @a2 := score,
    (@a0 + @a1 + @a2) AS three_day_total
FROM qa_db.variable_test
ORDER BY id, date

The main goal is to have a running 3-day total of the score per date and id in each row. I see what this method is doing based on the results. I am trying to understand more clearly HOW this variable assignment in the case state is achieving this or maybe the is a much better approach.

Strawberry
  • 33,750
  • 13
  • 40
  • 57
nation161r
  • 51
  • 1
  • 3
  • 14
  • For each date in the table, I want to iterate through the ids in the table and look at the score for each date and the 2 prior days to pull in a running 3-day average, for each row until the id isn't the same anymore at which point the process resets. The ids can exist for any amount of days so I want to reset the iteration whenever the id changes. I realize there won't be a "prior two days" at the beginning of each new id cycle and those values can just be 0 or NULL, – nation161r Mar 16 '18 at 18:54
  • I strongly suggest you see https://meta.stackoverflow.com/questions/333952/why-should-i-provide-an-mcve-for-what-seems-to-me-to-be-a-very-simple-sql-query and edit your question accordingly – Strawberry Mar 16 '18 at 19:03
  • Here is the SQLFiddle: http://sqlfiddle.com/#!9/cd3a81/4 – nation161r Mar 16 '18 at 19:18
  • Apparently I'm missing something. I apologize. Examples provided, some sample code/approach provided, SQLFiddle provided. Maybe it's my explanation. – nation161r Mar 16 '18 at 20:04
  • Does your query produce the desired result? And are dates contiguous (no gaps)? – Strawberry Mar 17 '18 at 10:57
  • Yes, dates are contiguous, no gaps. The query produces the desired result but it's not clear to me exactly how it's arriving at those results by using the variable assignments and case statement. – nation161r Mar 20 '18 at 16:12
  • I don't think I understand how it's arriving at it either - it seems rather convoluted to me - but that's not to say that it's necessarily inefficient as far as the internal workings of MySQL are concerned. – Strawberry Mar 20 '18 at 17:43

1 Answers1

0

I don't understand how the results are being calculated - or even if they're really being calculated in the way that you want.

Let's reduce the data set:

SELECT * FROM variable_test WHERE id IN(78910,123456) AND date < '2018-01-07';
+------------+--------+-------+
| date       | id     | score |
+------------+--------+-------+
| 2018-01-01 |  78910 | 75212 |
| 2018-01-01 | 123456 |   465 |
| 2018-01-02 |  78910 |   111 |
| 2018-01-02 | 123456 |   165 |
| 2018-01-03 |  78910 |     2 |
| 2018-01-03 | 123456 |    14 |
| 2018-01-04 |  78910 |     2 |
| 2018-01-04 | 123456 |   420 |
| 2018-01-05 |  78910 |    65 |
| 2018-01-05 | 123456 |  4152 |
| 2018-01-06 |  78910 |   511 |
| 2018-01-06 | 123456 |     1 |
+------------+--------+-------+

What should the desired result of this data set look like (show calculations)?

Strawberry
  • 33,750
  • 13
  • 40
  • 57
  • When running the inner section of the query here: http://sqlfiddle.com/#!9/cd3a81/7 There are columns for each for the variable assignments in the case statement. It looks to be assigning a score from 2days ago, 1day ago, and current row date, all to 1 row per date field, then summing it for a three_day_total per row. You can see the values sort of "staircase" through the columns for every three rows. This inner result set is really what I am eventually querying. I definitely agree this is convoluted. – nation161r Mar 20 '18 at 18:38
  • But it's just providing a rolling 3 day sum (ordered by user). This isn't an average, and wherever the formula transitions from one user to another , the values provided are (briefly) misleading . This doesn't seem to fulfill the brief - but perhaps I've misunderstood – Strawberry Mar 20 '18 at 19:46
  • You're correct. That part can be swapped for sum or avg I suppose. Somehow the variables after the "END" statement (@a0:=@a1, @a1:=@a2 and so on) are grabbing values from previous rows even though they are all initially assigned a value of 0 in the case statement. Seems like it would just reassign 0 to each other since they equal 0. I know I'm butchering my own question. – nation161r Mar 20 '18 at 20:34