I'm trying to get a running total using a subquery. (I'm using Metabase, which doesn't seem to accept or process variables in queries.)
My Query:
SELECT date_format(t.`session_stop`, '%d') AS `session_stop`,
       sum(t.`energy_used` / 1000) AS `csum`,
       (
           SELECT (SUM(a.`energy_used`) / 1000)
           FROM `sessions` a
           WHERE date_format(a.`session_stop`, '%Y-%m-%d') <= date_format(t.`session_stop`, '%Y-%m-%d')
             AND str_to_date(concat(date_format(a.`session_stop`, '%Y-%m'), '-01'), '%Y-%m-%d') = str_to_date(concat(date_format(now(), '%Y-%m'), '-01'), '%Y-%m-%d')
           ORDER BY str_to_date(date_format(a.`session_stop`, '%e'), '%d') ASC
       ) AS `sum`
FROM `sessions` t
WHERE str_to_date(concat(date_format(t.`session_stop`, '%Y-%m'), '-01'), '%Y-%m-%d') = str_to_date(concat(date_format(now(), '%Y-%m'), '-01'), '%Y-%m-%d')
GROUP BY date_format(t.`session_stop`, '%e')
ORDER BY str_to_date(date_format(t.`session_stop`, '%d'), '%d') ASC;
This takes about 1.29 seconds to run (43K rows in total, 14 rows returned).
If I remove the sum(t.`energy_used` / 1000) AS `csum`, line, the query takes 8 minutes and 40 seconds.
Why is this? I'd rather not have that line, but I also can't wait 8 minutes for a query to finish.
(I know I can create a cumulative column, but I'm especially interested in why this additional sum() speeds the whole query up.)
P.S. I tested this in both the MySQL console and the Metabase interface.
EXPLAIN output with the extra sum():
+----+--------------------+-------+------+---------------+------+---------+------+-------+---------------------------
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra
+----+--------------------+-------+------+---------------+------+---------+------+-------+---------------------------
| 1 | PRIMARY | t | ALL | NULL | NULL | NULL | NULL | 42055 | Using where; Using tempora
| 2 | DEPENDENT SUBQUERY | a | ALL | NULL | NULL | NULL | NULL | 42055 | Using where
+----+--------------------+-------+------+---------------+------+---------+------+-------+---------------------------
2 rows in set (0.00 sec)
Without the extra sum():
+----+--------------------+-------+------+---------------+------+---------+------+-------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------------+-------+------+---------------+------+---------+------+-------+----------------------------------------------+
| 1 | PRIMARY | t | ALL | NULL | NULL | NULL | NULL | 44976 | Using where; Using temporary; Using filesort |
| 2 | DEPENDENT SUBQUERY | a | ALL | NULL | NULL | NULL | NULL | 44976 | Using where |
+----+--------------------+-------+------+---------------+------+---------+------+-------+----------------------------------------------+
2 rows in set (0.00 sec)
The schema is not much more than a single table:
| session_id (INT, auto incr., prim. key) | session_stop (DATETIME) | energy_used (INT) |
| 1                                       | 1-1-2016 10:00:00       | 123456            |
| 2                                       | 1-1-2016 10:05:00       | 123456            |
| 3                                       | 1-2-2016 10:10:00       | 123456            |
| 4                                       | 1-2-2016 12:00:00       | 123456            |
| 5                                       | 3-3-2016 14:05:00       | 123456            |
Some examples on the internet use the ID in the WHERE clause, but I got poor results with that approach.
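For comparison, here is a minimal sketch of another way to get the same running total without variables and without a correlated subquery. It is untested against the real data and rests on assumptions: the aliases `d` and `p` and the column names `day` and `energy_sum` are made up for illustration, and it assumes that comparing `session_stop` against a plain date range gives the same rows as the month filter above. The idea is to aggregate `sessions` to one row per day first, then accumulate over those few daily rows with a self-join.

```sql
-- Sketch only: aggregate once per day, then build the running total
-- by joining each day to all days up to and including it.
-- `d`, `p`, `day` and `energy_sum` are illustrative names, not part of the original query.
SELECT date_format(d.`day`, '%d') AS `session_stop`,
       SUM(p.`energy_sum`) / 1000 AS `sum`
FROM (
    -- One row per day of the current month.
    SELECT DATE(`session_stop`) AS `day`,
           SUM(`energy_used`) AS `energy_sum`
    FROM `sessions`
    WHERE `session_stop` >= date_format(now(), '%Y-%m-01')
      AND `session_stop` <  date_add(date_format(now(), '%Y-%m-01'), INTERVAL 1 MONTH)
    GROUP BY DATE(`session_stop`)
) d
JOIN (
    -- The same daily aggregate, used as the "previous days" side of the join.
    SELECT DATE(`session_stop`) AS `day`,
           SUM(`energy_used`) AS `energy_sum`
    FROM `sessions`
    WHERE `session_stop` >= date_format(now(), '%Y-%m-01')
      AND `session_stop` <  date_add(date_format(now(), '%Y-%m-01'), INTERVAL 1 MONTH)
    GROUP BY DATE(`session_stop`)
) p ON p.`day` <= d.`day`
GROUP BY d.`day`
ORDER BY d.`day` ASC;
```

In this shape the join works on at most 31 daily rows per side rather than on every session row. The range condition also leaves `session_stop` unwrapped by functions, so an index on that column could be used if one existed (the EXPLAIN output above shows none).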