-1

I have created a mysql query to calculate moving averages of data by using multiple self-joins as shown below. This is consuming lot of time and the data rows are in 100k per query. Any way to further optimize it to reduce time please?

select a.rownum,a.ma_small_price, b.ma_medium_price from
    (SELECT t3.rownum, AVG(t.last_price) as 'ma_small_price' 
FROM temp_data t3
    left JOIN temp_data t ON t.rownum BETWEEN ifnull(t3.rownum,0) - @psmall AND t3.rownum
GROUP BY t3.rownum) 
inner join
    (SELECT t3.rownum, AVG(t.last_price) as 'ma_medium_price' 
FROM temp_data t3
    left JOIN temp_data t ON t.rownum BETWEEN ifnull(t3.rownum,0) - @pmedium AND t3.rownum
GROUP BY t3.rownum) b on a.rownum = b.rownum

enter image description here

  • 1
    What version of mysql are you on? – P.Salmon May 12 '20 at 12:17
  • *`BETWEEN ifnull(t3.rownum,0) - @psmall AND t3.rownum GROUP BY t3.rownum`* - IFNULL is obviously excess. If t3.rownum IS NULL then BETWEEN will give NULL == FALSE (because its upper limit IS NULL). – Akina May 12 '20 at 12:21
  • sql server version is 8.0.13 – Ankur Bansal May 12 '20 at 12:23
  • 2
    "sql server" != "mysql" – Rick James May 12 '20 at 22:35
  • Please in code questions give a [mre]--including cut & paste & runnable minimal code & minimal representative data given as code. For SQL that includes DBMS & DDL, including constraints, indexes & tabular initialization. For SQL performance that includes EXPLAIN results & statistics. Please research & summarize. For SQL that includes basics of optimization/performance--immediately leading to indexes, plans, statistics & SARGability. [Tips for asking a good SQL question](https://meta.stackoverflow.com/a/271056/3404097) Ask re optimization after you have learned & applied those basics. [ask] – philipxy May 15 '20 at 06:59
  • Please clarify via edits, not comments. – philipxy May 15 '20 at 06:59

2 Answers2

1

Since you're running MySQL 8 you should be able to use window functions to get the same result more efficiently. Without seeing sample data it's hard to be 100% certain but this should be close. Note that to use variables in a window frame, you need to use a prepared statement:

SET @sql = '
SELECT rownum,
       AVG(last_price) OVER (ORDER BY rownum ROWS BETWEEN ? PRECEDING AND CURRENT ROW) AS ma_small_price,
       AVG(last_price) OVER (ORDER BY rownum ROWS BETWEEN ? PRECEDING AND CURRENT ROW) AS ma_medium_price
FROM temp_data';
PREPARE stmt FROM @sql;
EXECUTE stmt USING @psmall, @pmedium;

Demo on dbfiddle

Nick
  • 138,499
  • 22
  • 57
  • 95
1

OVER ( ... ) is disappointingly slow -- in both MySQL 8.0 and MariaDB 10.x.

I like "exponential moving average" as being easier to compute than "moving average". The following is roughly equivalent to what Nick proposed. This runs faster, but has slightly different results:

SELECT  rownum,
        @small := @small + 0.5 * (last_price - @small) AS mae_small_price,
        @med   := @med   + 0.2 * (last_price - @med)   AS mae_med_price
    FROM ( SELECT @small := 10, @med := 10 ) AS init
    JOIN temp_data
    ORDER BY rownum;

The coefficient controls how fast the exponential moving average adapts to changes in the data. It should be greater than 0 and less than 1.

The "10" that I initialized the EPA to was a rough guess of the average -- it biases the first few values but is gradually swamped as more values are folded in.

Rick James
  • 135,179
  • 13
  • 127
  • 222