18

I am using this code: (from this question: How to get the last record per group in SQL substituting my own columns)

WITH e AS
(
 SELECT *,
     ROW_NUMBER() OVER
     (
         PARTITION BY ApplicationId
         ORDER BY theDate DESC
     ) AS Recency
 FROM [Event]
)
SELECT *
FROM e
WHERE Recency = 1

Is it possible to 'partition' only if two fields are the same? For example I have data like this:

ID      Name    theDate
123     John    01/01/2012
123     John    01/02/2012
123     Doe     01/01/2012
456     Smith   02/04/2012
789     Smith   02/01/2012
789     Smith   02/09/2012
789     Roger   02/08/2012

From that data I'd want to return:

ID      Name    theDate
123     John    01/02/2012
123     Doe     01/01/2012
456     Smith   02/04/2012
789     Smith   02/09/2012
789     Roger   02/08/2012

Thanks for any help.

Thomas

Community
  • 1
  • 1
tsdexter
  • 2,911
  • 4
  • 36
  • 59

2 Answers2

46

You can have several columns separated by a comma

WITH e AS 
( 
 SELECT *, 
     ROW_NUMBER() OVER 
     ( 
         PARTITION BY ApplicationId , Name
         ORDER BY theDate DESC 
     ) AS Recency 
 FROM [Event] 
) 
SELECT * 
FROM e 
WHERE Recency = 1 
JeffO
  • 7,957
  • 3
  • 44
  • 53
  • Have you tried that? The documentation of Over( http://msdn.microsoft.com/en-us/library/ms189461.aspx) is not clear about how many columns you can specify. – 000 Apr 18 '12 at 01:57
  • This doesn't work as intended. It still partitions ALL the ones with the same ID and then I suppose further partitions that based on the second column. See my answer I added. – tsdexter Apr 18 '12 at 02:08
  • 4
    +1. This works exactly as intended and provides the exact answer you asked for in the question. If it doesn't, you need to explain how. I suspect you haven't actually tested it because your description of how it didn't work does not match what it actually does. – Aaron Bertrand Apr 18 '12 at 02:37
  • 2
    I concur with the comments from @AaronBertrand. I have used PARTITION BY with multiple columns on many occasions. The example by JeffO is exactly what I have used. – Steve Stedman Apr 18 '12 at 02:44
5

I've found it the answer here: Table partitioning using 2 columns

You can only partition on 1 column, however that column can be generated to make a 'multiple partition' like so:

WITH e AS 
( 
 SELECT *, 
 ROW_NUMBER() OVER 
 ( 
     PARTITION BY CONVERT(VARCHAR(100),ApplicationId) + ' ' + Name
     ORDER BY theDate DESC 
 ) AS Recency 
 FROM [Event] 
) 
SELECT * 
FROM e 
WHERE Recency = 1 

Adding the two columns together as one single string ensures it will only partition if both columns are identical.

Community
  • 1
  • 1
tsdexter
  • 2,911
  • 4
  • 36
  • 59
  • 3
    I think you are confusing `table partitioning` with `PARTITION BY` - while they are named similarly, they are not related in any way. You certainly are not limited to one column or expression in `PARTITION BY` and there is no need for your so-called 'multiple partition'... – Aaron Bertrand Apr 18 '12 at 02:24
  • 1
    @AaronBertrand yes I realize that the link I posted is about table partioning and I'm using PARTITION BY however, what the guy who answered that question said applies here too. The suggestion above (field1, field2) doesn't work to return the data how I need it, whereas the code I supplied base don the answer in the link (Field1 + ' ' + Field2) does provide the proper data... Is this not the correct way to do it? – tsdexter Apr 18 '12 at 02:28
  • 1
    @AaronBertrand I didn't fully test the results but I think PARTITION BY field1, field2 first partition on the first field and then further partitions that result on the second field, is that correct? That's not what I need. – tsdexter Apr 18 '12 at 02:30
  • 1
    You need to test the results before you start making assumptions about what it will or will not do. Did you try @Jeff O's answer? Can you explain how it didn't work as expected? I tried his code using your sample data and it gave exactly the desired results you asked for (albeit not in the right order). – Aaron Bertrand Apr 18 '12 at 02:36
  • 1
    Hmm. My mistake. It must be something else that is causing the difference but it did not work with my real data which I did not post as is confidential to my employer. That's my mistake for not posting accurate data/columns. I'm not sure where the difference is coming from in that case, but I did test it with my data and it didn't work right. – tsdexter Apr 18 '12 at 02:59