2

I have 3 tables.

Table 1 has 2 fields : Time/Value

Table 2 has 2 fields : Time/Value

Table 3 has 1 field : Time

I want to find, for each Time in Table 3, the Value in Table 1 and Table 2 which is the closest to the Time field.

Time precision in Table1 and 2 is around millisecond. Time precision in Table3 is 1 second.

Is it possible to do so in a SQL query without having to parse the tables with a loop myself?

user2164703
  • 199
  • 1
  • 11
  • yes, use a http://en.wikipedia.org/wiki/Join_(SQL). if still stuck show the code you already have. – mb21 Jun 15 '13 at 16:52
  • 1
    Can you please tag your question with the RDBMS you are using (MySQL, Oracle, SQL Server, Postgres etc). If this is Oracle, SQL Server or Postgres (or any other that supports windowing functions) then this will help: http://stackoverflow.com/questions/13929053/how-to-get-the-closest-dates-in-oracle-sql/13929416#13929416 – Ben Jun 15 '13 at 16:52
  • @mb21 - I don't think you read the question – Hogan Jun 15 '13 at 16:57

2 Answers2

4

You can do this in SQL but because of the tools available with SQL it will be slow, for each row of Table3 you have to look for all rows of Table1 and Table2. If you hand coded the solution in a more versatile language you could optimize for the problem domain. If you are only doing it as an ad-hoc request SQL will be the easiest to code and looks something like this:

SELECT t3.time,
  (SELECT TOP 1 t1.value
   FROM t1
   ORDER BY ABS(DATEDIFF(ms,t3.time,t1.time)) ASC
  ) as t1value,
  (SELECT TOP 1 t2.value
   FROM t2
   ORDER BY ABS(DATEDIFF(ms,t3.time,t2.time)) ASC
  ) as t2value
FROM t3

How this works

For each row in t3 do a select on both t1 and t2 ordered by the time difference and only take the smallest one.

A note about speed

This code will run in O(N3 * N1) + O(N3 * N2). If you hand coded a good algorithm you would be able to get O(N3 * log(N1)) + O(N3 * log(N2). (Because you could do a fast nearest search on the time).

Aleksandr Fedorenko
  • 16,594
  • 6
  • 37
  • 44
Hogan
  • 69,564
  • 10
  • 76
  • 117
  • Thank you its what i was looking for. However as I have couple of hundred thousands entries in each table, the query is too long to process. I have no choice but to parse it one time and assign each timestamp on the run. But thanks, it allowed be to test accuracy on small parts of my data. – user2164703 Jun 16 '13 at 14:16
0

Here's one way of doing this:

select t3.Time
  , t1Time = case when abs(datediff(ms, t1Below.Time, t3.Time))
      <= abs(datediff(ms, t1Above.Time, t3.Time))
    then t1Below.Time
    else t1Above.Time
    end
  , t1Value = case when abs(datediff(ms, t1Below.Time, t3.Time))
      <= abs(datediff(ms, t1Above.Time, t3.Time))
    then t1Below.Value
    else t1Above.Value
    end
  , t2Time = case when abs(datediff(ms, t2Below.Time, t3.Time))
      <= abs(datediff(ms, t2Above.Time, t3.Time))
    then t2Below.Time
    else t2Above.Time
    end
  , t2Value = case when abs(datediff(ms, t2Below.Time, t3.Time))
      <= abs(datediff(ms, t2Above.Time, t3.Time))
    then t2Below.Value
    else t2Above.Value
    end
from t3
  outer apply (select top 1 t1Below.*
              from t1 t1Below
              where t3.Time >= t1Below.Time
              order by t1Below.Time desc) t1Below
  outer apply (select top 1 t1Above.*
              from t1 t1Above
              where t3.Time <= t1Above.Time
              order by t1Above.Time) t1Above
  outer apply (select top 1 t2Below.*
              from t2 t2Below
              where t3.Time >= t2Below.Time
              order by t2Below.Time desc) t2Below
  outer apply (select top 1 t2Above.*
              from t1 t2Above
              where t3.Time <= t2Above.Time
              order by t2Above.Time) t2Above

SQL Fiddle with demo.

This approach find the most recent t1 and t2 times/values both before and after each t3 times, then works out which one of the before/after rows to use in the select statement.

The advantage with this way is that SQL Server can uses indexes effectively to get the before/after values; the work done to work out which before/after value to use for each t3 time should be offset by efficient retrieval of the before/after values.

Ian Preston
  • 38,816
  • 8
  • 95
  • 92