2

I have a table in SQLite dumped from a CSV logging file with 20-50 columns, up to 500k rows. I would like to be able to query various columns in a few ways: instantaneous value, moving average value, and slope. I was contemplating adding columns to the table for the moving averages and slopes, but I thought I would investigate the query method first. I found some moving average help and slope help from SO which got me started.

To test out what I'm trying to do, I created the following SQL script:

DROP TABLE IF EXISTS TEST;
CREATE TABLE TEST (ID INTEGER UNIQUE PRIMARY KEY ASC, c1, c2);
INSERT INTO TEST (c1, c2) VALUES (1,5), (7,10), (11, 15), (5, 20), (8, 25), (10, 30), (12, 35), (15, 40), (18, 45), (20, 50);

--Create VIEW for Moving average (3 rows) of column c1
DROP VIEW IF EXISTS AVG_VIEW;
CREATE TEMP VIEW IF NOT EXISTS AVG_VIEW AS
    SELECT * FROM TEST AS t,
        (SELECT t1.ID, AVG(t2.c1) AS mavg
            FROM TEST AS t1, TEST AS t2
            WHERE t2.ID BETWEEN (t1.ID-2) AND t1.ID
            GROUP BY t1.ID) sq
        WHERE (t.ID = sq.ID);

--Create view for slope of 2 consecutive moving averages (just use dif for now)
DROP VIEW IF EXISTS SLOPE_VIEW;
CREATE TEMP VIEW IF NOT EXISTS SLOPE_VIEW AS
  SELECT a1.*, (a1.mavg - a2.mavg) AS slope
      FROM AVG_VIEW AS a1, AVG_VIEW AS a2
      WHERE a2.ID = (a1.ID-1)
      GROUP BY a1.ID;

SELECT * FROM SLOPE_VIEW;

This gives the expected result:

ID c1 c2  ID:1  mavg     slope
2  7  10  2    4        3
3  11 15  3    6.3333   2.333
4  5  20  4    7.6667   1.333
5  8  25  5    8        0.333
6  10 30  6    7.6667  -0.333
7  12 35  7    10       2.333
8  15 40  8    12.333   2.333
9  18 45  9    15       2.667
10 20 50  10   17.667   2.667

I ran "EXPLAIN QUERY PLAN" on the SELECT statement, but I couldn't figure out how to interpret it:

selectid order from detail
1        0     0    SCAN TABLE TEST AS t1 USING COVERING INDEX sqlite_autoindex_TEST_1
1        1     1    SEARCH TABLE TEST AS t2 USING INTEGER PRIMARY KEY (rowid>? AND rowid<?)
2        0     0    SCAN TABLE TEST AS t1 USING COVERING INDEX sqlite_autoindex_TEST_1
2        1     1    SEARCH TABLE TEST AS t2 USING INTEGER PRIMARY KEY (rowid>? AND rowid<?)
0        0     1    SCAN SUBQUERY 1 AS sq
0        1     0    SEARCH TABLE TEST AS t USING INTEGER PRIMARY KEY (rowid=?)
0        2     2    SEARCH TABLE TEST AS t USING INTEGER PRIMARY KEY (rowid=?)
0        0     0    EXECUTE LIST SUBQUERY 3
0        3     3    SEARCH SUBQUERY 2 AS sq USING AUTOMATIC COVERING INDEX (ID=?)
0        0     0    USE TEMP B-TREE FOR GROUP BY

My question is: is this a really bad plan/query? Can you explain the query-plan? Should I be storing one or both of the average/slope in the table? Is there a better way to accomplish this?

Community
  • 1
  • 1
6utt3rfly
  • 208
  • 3
  • 4

0 Answers0