-1

My problem is that I try to calculate a moving average over some values from my table (one avg value for each row). It actually works but if it comes to gaps such as id[20,18,17] or date[2018-05-11,2018-05-9,2018-05-8] the calculation becomes wrong. I´m looking for a way to use a specific number of next rows to prevent this to happen.

The table contains id (auto_increment), date and close (Float).

This is my code:

CREATE DEFINER=`root`@`localhost` PROCEDURE `moving_avg`(IN periode INT)
    NO SQL
BEGIN
    select hist_ask.id, hist_ask.date, hist_ask.close, round(avg(past.close),2) as mavg   
    from hist_ask    
    join hist_ask as past     
      on past.id between hist_ask.id - (periode-1)  and hist_ask.id
    group by hist_ask.id, hist_ask.close 
    ORDER BY hist_ask.id DESC 
    LIMIT 10;
END

The table I use looks like this

id , date       , close
20 , 2018-10-13 , 12086.5
19 , 2018-10-12 , 12002.2
17 , 2018-10-11 , 12007.0
and so on

The output looks like this:

The output I get from the query

Thanks in advance!

marc_s
  • 732,580
  • 175
  • 1,330
  • 1,459
rodarmy
  • 1
  • 2
  • See: [Why should I provide an MCVE for what seems to me to be a very simple SQL query?](https://meta.stackoverflow.com/questions/333952/why-should-i-provide-an-mcve-for-what-seems-to-me-to-be-a-very-simple-sql-query) – Strawberry Nov 12 '18 at 12:43
  • Please add some sample data to your question, specifically showing the cases where your current query is failing. Also, add the current query's output and expected output – Madhur Bhaiya Nov 12 '18 at 12:44
  • the problem is it is actualy working but if there is a id missing it calculates wrong bacause it cannot find the right id. I couldn´t find any propper solution to just use for ech row the next 3 rows for the average instead of using a specific id. – rodarmy Nov 12 '18 at 16:12

1 Answers1

0

I finaly make it work using a temporary table. I can now give two parameters to the procedure:

  1. periode: the periode the moving average is calculated with
  2. _limit: limits the result set

Important for performance is the

ALTER TABLE temp
    ENGINE=MyISAM;

statement because it reduces the execution time significantly. For example when proccessing 2000 rows it needs about 0.5 seconds, before adding it it needed about 6 seconds

Thats the code:

CREATE DEFINER=`root`@`localhost` PROCEDURE `moving_avg`(IN periode INT, IN _limit INT)
    NO SQL
BEGIN

 DECLARE a FLOAT DEFAULT 0;
 DECLARE i INT DEFAULT 0;
 DECLARE count_limit INT  DEFAULT 0;

   SET @rn=0;
 CREATE TEMPORARY TABLE IF NOT EXISTS temp (
                SELECT 
                    @rn:=@rn+1 AS pri_id, 
                    date, 
                    close , a AS 
                    mavg 
                FROM hist_ask);

ALTER TABLE temp
ENGINE=MyISAM;

 SET i=(SELECT pri_id FROM temp ORDER by pri_id DESC LIMIT 1);
 SET count_limit= (i-_limit)-periode;


WHILE i>count_limit DO
SET a= (SELECT avg(close) FROM temp WHERE pri_id BETWEEN i-(periode-1) AND i);
UPDATE temp SET mavg=a WHERE pri_id=i;
SET i=i-1;
END WHILE;

SELECT pri_id,date,close,round(mavg,2) AS mavg FROM temp ORDER BY pri_id DESC LIMIT _limit;


DROP TABLE temp;

END

The result looks like that:

CALL `moving_avg`(3,5)
  • pri_id, date, close, mavg
  • 1999 2018-09-13 12086.6 12032.03
  • 1998 2018-09-11 12002.2 11983.47
  • 1997 2018-09-10 12007.3 11976.53
  • 1996 2018-09-07 11940.9 11993.80
  • 1995 2018-09-06 11981.4 12089.23

5 row(s) returned 0.047 sec / 0.000 sec

rodarmy
  • 1
  • 2