5

From the data below I need to select the record nearest to a specified date for each Linked ID using SQL Server 2005:

ID     Date      Linked ID
...........................
1    2010-09-02     25
2    2010-09-01     25
3    2010-09-08     39
4    2010-09-09     39
5    2010-09-10     39
6    2010-09-10     34
7    2010-09-29     34
8    2010-10-01     37
9    2010-10-02     36
10   2010-10-03     36

So selecting them using 01/10/2010 should return:

1    2010-09-02     25 
5    2010-09-10     39
7    2010-09-29     34 
8    2010-10-01     37
9    2010-10-02     36

I know this must be possible, but can't seem to get my head round it (must be too near the end of the day :P) If anyone can help or give me a gentle shove in the right direction it would be greatly appreciated!

EDIT: Also I have come across this sql to get the closest date:

abs(DATEDIFF(minute, Date_Column, '2010/10/01'))

but couldn't figure out how to incorporate into the query properly...

Thanks

Iain Ward
  • 9,850
  • 5
  • 34
  • 41
  • Add... you added in the edit after my comment. Ok I will write you the complete query. – Hogan Jan 10 '11 at 17:46
  • Your title is misleading, it is probably "select ID order by nearest date" – dvhh Jan 10 '11 at 17:54
  • your example using minute as the first param to datediff will probably not work (since you don't have time in your example dates.) You want to use day -- which can be abbr. as `day`, `dd`, or `d`. Interestingly all the answer use a different abbr. – Hogan Jan 10 '11 at 17:58
  • @dvhh No I dont think so, I'm describing what I'm trying to do and if you look at Hogans answer, that's exactly what he does – Iain Ward Jan 10 '11 at 18:05

3 Answers3

8

you can try this.

DECLARE @Date DATE = '10/01/2010';

WITH cte AS
    (
    SELECT ID, LinkedID, ABS(DATEDIFF(DD, @date, DATE)) diff,
        ROW_NUMBER() OVER (PARTITION BY LinkedID ORDER BY ABS(DATEDIFF(DD, @date, DATE))) AS SEQUENCE
    FROM MyTable
    )

SELECT *
FROM cte
WHERE SEQUENCE = 1
ORDER BY ID
;

You didn't indicate how you want to handle the case where multiple rows in a LinkedID group represent the closest to the target date. This solution will only include one row And, in this case you can't guarantee which row of the multiple valid values is included.

You can change ROW_NUMBER() with RANK() in the query if you want to include all rows that represent the closest value.

bobs
  • 21,844
  • 12
  • 67
  • 78
  • this is nicer than my query (only one select) but mine may be clearer to a beginner... – Hogan Jan 10 '11 at 17:56
  • More information on ROW_NUMBER() is available here: http://msdn.microsoft.com/en-us/library/ms186734.aspx – Sean Reilly Jan 10 '11 at 18:00
  • @Hogan - I'm not sure I agree. If you are going to use a CTE anyway, then you might as well take advantage of Row_Number(). – Thomas Jan 10 '11 at 18:09
  • Thanks for your answer. Duplicates are irrelevant, as long as it returns one thats all I need – Iain Ward Jan 10 '11 at 18:10
  • @Thomas : Yeah, if you seem my comment on my answer to @Sean I basically said as much (and voted for your answer.) – Hogan Jan 10 '11 at 18:12
4

You want to look at the absolute value of the DATEDIFF function (http://msdn.microsoft.com/en-us/library/ms189794.aspx) by days.

The query can look something like this (not tested)

with absDates as 
(
   select *, abs(DATEDIFF(day, Date_Column, '2010/10/01')) as days
   from table
), mdays as
( 
   select min(days) as mdays, linkedid
   from absDates
   group by linkedid
)
select * 
from absdates
inner join mdays on absdays.linkedid = mdays.linkedid and absdays.days = mdays.mdays
Hogan
  • 69,564
  • 10
  • 76
  • 117
  • @Hogan Ah yes I have come across it already, but forgot to mention that in the question, so I've updated it. Thanks for mentioning that – Iain Ward Jan 10 '11 at 17:45
  • @w69rdy : example query added. – Hogan Jan 10 '11 at 17:52
  • The example query might not be right -- it should be min(days), otherwise you would return the largest difference, right? Also, I don't think that the performance on this will be very good at all. In general I would recommend using the ROW_NUMBER() solution. It should be more straightforward and more performant. – Sean Reilly Jan 10 '11 at 17:56
  • @Hogan, your query fails right now. Where is table absdays defined? – bobs Jan 10 '11 at 18:00
  • @bobs : It wasn't :D I fixed the typo. – Hogan Jan 10 '11 at 18:01
  • @Sean : I changed the max to min -- and I agree that Row_Number() will be more performant. I don't agree that it is clearer since it is a MS only solution and not part of standard SQL. Of course that does ring hollow since I'm using CTEs. – Hogan Jan 10 '11 at 18:03
  • +1 @Hogan Thanks for your answer, it works but I think I prefer bobs as its a little bit slicker and removes duplicates :) – Iain Ward Jan 10 '11 at 18:09
  • @w69rdy : yep, it is a better TSQL answer. This is the way to do it if you don't have TSQL (and used temp tables instead of CTE – Hogan Jan 10 '11 at 18:14
  • @Hogan: it's worth nothing that if temp tables were used, performance probably wouldn't be a problem. I would say this might be the preferred solution for databases other than mssql 2005 and later. – Sean Reilly Jan 11 '11 at 06:59
  • @Sean : I agree unless that SQL includes functionally that allows for Ranking functionality. – Hogan Jan 11 '11 at 14:45
0

You can also try to do it with a subquery in the select statement:

select  [LinkedId],
        (select top 1 [Date] from [Table] where [LinkedId]=x.[LinkedId] order by abs(DATEDIFF(DAY,[Date],@date)))
from    [Table] X
group by [LinkedId]
pcofre
  • 3,976
  • 18
  • 27