We have two devices collecting data at roughly 30-second intervals. The devices are located at two widely-spaced sites. The absolute time of each collection for each site can vary +/- 30 seconds. Occasionally, a site will go offline for various reasons. The data from each device represents a different kind of measurement, e.g. temperature from device1 and humidity from device2. A process records the data from device1 and device2 into separate tables in a SQL Server 2012 Express database running on a server separate from each device.
It is desired to present the data from both devices correlated into records which will contain columns with the value for site1 for a particular date/time, combined with the data for site2 if any is available. User programs will then request recordsets for a specified date/time range. To this end, I constructed the following SP:
ALTER PROCEDURE [db_datareader].[DataJoinDateRange]
@DateFrom DateTime = '2014-05-15 15:10:24.000',
@DateTo DateTime = '2014-06-15 15:10:24.000'
AS
BEGIN
SET NOCOUNT ON;
WITH site1(id, date_time, dataval)
AS
(
SELECT *
FROM site1_data
WHERE site1_data.date_time BETWEEN @DateFrom AND @DateTo
),
site2(id, date_time, datavaql)
AS
(
SELECT *
FROM site2_data
WHERE site2_data.date_time BETWEEN @DateFrom AND @DateTo
)
SELECT * from site1 site1_res
INNER JOIN (select id, date_time, data_val) site2_res
on ABS(DATEDIFF("SECOND", site1_res.date_time, site_2_res.date_time)) < 30
END
The intent is to first select out records in the desired date/time range, and then join records from site1 to those in site2 which are within the +/- 30 sec. range. The resulting recordset will contain data from both devices, or nulls when no corresponding record exists.
This seems to work: records with the desired form are output and correspond to the correct records in each table. But the execution is very slow. A query over a date range of a few weeks takes about 1 minute and 30 seconds. Site1 contains about 5000 records in this date range, while Site2 contains only 1 record. A SELECT query on the date range only for each table executes in well under a second.
I have never delved very deeply into SQL before, but our small group has no one else at this time to do this task. Can anyone give me an idea of the proper way to do this, or at least how to accelerate this SP?