0

I am using MySQL to create a window function per category in another column. I am thinking of maybe combining a pivot table with a window function, however, do not know how to do a pivot table with MySQL.

Below is what I have in mind to do, create a column per cateogry and calculate the average over a given time range:

I have:

firstname lastname category amount timestamp
Tom Sta hair 1 2020-06-21
Chris Danny school 2 2020-06-22
Gee Elle books 1 2020-06-21
Tom Sta books 10 2020-06-23
Chris Danny hair 2 2020-06-25
Gee Elle school 15 2020-06-28

I want:

firstname lastname hair_last_3_days school_last_3_days boks_last_3_days
Tom Sta 1 1 21
Chris Danny 2 2 202
Gee Elle 2 1 1
Tom Sta 4 10 6
Chris Danny 3 2 0
Gee Elle 2 15 28

At the moment I'm able to create the averages

select first, last, date(trans_date_trans_time), amt, category,
avg(amt) over(partition by first, last, category order by date(trans_date_trans_time) ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) avg_last_3 from table;

However I realise that this will only work if the days are one after the other and won't work if there are two rows from the same day for one person. I'm looking for some help. Thanks

2 Answers2

0

If you have data on all days, then you can use conditional logic in the avg():

select first, last, date(trans_date_trans_time), amt, category,
       avg(case when category = 'hair' then amt enfd) over
           (partition by first, last, category
            order by date(trans_date_trans_time)
            ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
           ) as avg_hair_last_3
from table;

But a better solution is to use a time window frame:

select first, last, date(trans_date_trans_time), amt, category,
       avg(case when category = 'hair' then amt enfd) over
           (partition by first, last, category
            order by date(trans_date_trans_time)
            rows between interval 2 day preceding and current row
           ) as avg_hair_last_3
from table;
Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
0

If it was me, I'd execute something like the following, and handle any remaining issues in application code...

SELECT x.firstname
     , x.lastname
     , x.category
     , SUM(amount) total
  FROM my_table x
  JOIN 
     ( SELECT firstname  
            , lastname
            , category
            , MAX(timestamp) timestamp
         FROM my_table
        GROUP 
           BY firstname
            , lastname
            , category
     ) y
    ON y.firstname = x.firstname 
   AND y.lastname = x.lastname
   AND y.category = x.category
   AND x.timestamp BETWEEN y.timestamp - INTERVAL 2 DAY AND y.timestamp
 GROUP
    BY x.firstname
     , x.lastname
     , x.category 

Note that this isn't the 'average' of anything, because I don't really understand how your result set matches up to your data set.

Strawberry
  • 33,750
  • 13
  • 40
  • 57