I have a table recording the start time and end time of events of interest:
CREATE TABLE event_log (start_time DATETIME, end_time DATETIME);
INSERT INTO event_log VALUES ("2013-06-03 09:00:00","2013-06-03 09:00:05"), ("2013-06-03 09:00:03","2013-06-03 09:00:07"), ("2013-06-03 09:00:10","2013-06-03 09:00:12");
+---------------------+---------------------+
| start_time | end_time |
+---------------------+---------------------+
| 2013-06-03 09:00:00 | 2013-06-03 09:00:05 |
| 2013-06-03 09:00:03 | 2013-06-03 09:00:07 |
| 2013-06-03 09:00:10 | 2013-06-03 09:00:12 |
+---------------------+---------------------+
I am looking for a way to create a "time series" table where one column is a time index and another column is the count of events in progress at that time. I can do it with a subquery and a generator:
SET @first_time := (SELECT MIN(start_time) FROM event_log);
SET @last_time := (SELECT MAX(end_time) FROM event_log);
CREATE OR REPLACE VIEW generator_16
AS SELECT 0 n UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL
SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL
SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL
SELECT 9 UNION ALL SELECT 10 UNION ALL SELECT 11 UNION ALL
SELECT 12 UNION ALL SELECT 13 UNION ALL SELECT 14 UNION ALL
SELECT 15;
CREATE TABLE time_series (t DATETIME, event_count INT(11))
SELECT @first_time + INTERVAL n SECOND t, NULL AS event_count
FROM generator_16
WHERE @first_time + INTERVAL n SECOND <= @last_time;
UPDATE time_series
SET event_count= (SELECT COUNT(*) FROM event_log
WHERE start_time<=t AND end_time>=t);
+---------------------+-------------+
| t | event_count |
+---------------------+-------------+
| 2013-06-03 09:00:00 | 1 |
| 2013-06-03 09:00:01 | 1 |
| 2013-06-03 09:00:02 | 1 |
| 2013-06-03 09:00:03 | 2 |
| 2013-06-03 09:00:04 | 2 |
| 2013-06-03 09:00:05 | 2 |
| 2013-06-03 09:00:06 | 1 |
| 2013-06-03 09:00:07 | 1 |
| 2013-06-03 09:00:08 | 0 |
| 2013-06-03 09:00:09 | 0 |
| 2013-06-03 09:00:10 | 1 |
| 2013-06-03 09:00:11 | 1 |
| 2013-06-03 09:00:12 | 1 |
+---------------------+-------------+
Is there a more efficient way to do it? This method requires a subquery for every time index. Would there, for example, be a way to do it that requires one subquery per "event_log" record? My real problem has 500k time index entries and 1k events; it's taking a little longer than I would like (about 90 seconds).
The "generator" snippet came from http://use-the-index-luke.com/blog/2011-07-30/mysql-row-generator . Clearly one of the larger generators, like the 64k version or the 1M version, would be needed for larger problems.