Select entire partition where max row in partition is greater than 1

Question

I'm partitioning by some non unique identifier, but I'm only concerned in the partitions with at least two results. What would be the way to get out all the instances where there's exactly one of the specified identifier?

Query I'm using:

SELECT ROW_NUMBER() OVER
        (PARTITION BY nonUniqueId ORDER BY nonUniqueId, aTimeStamp) as row
    ,nonUniqueId
    ,aTimeStamp
FROM myTable

What I'm getting:

row | nonUniqueId | aTimeStamp
---------------------------------
1   | 1234        | 2014-10-08...
2   | 1234        | 2014-10-09...
1   | 1235        | 2014-10-08...
1   | 1236        | 2014-10-08...
2   | 1236        | 2014-10-09...

What I want:

row | nonUniqueId | aTimeStamp
---------------------------------
1   | 1234        | 2014-10-08...
2   | 1234        | 2014-10-09...
1   | 1236        | 2014-10-08...
2   | 1236        | 2014-10-09...

Thanks for any direction :)

are the `aTimeStamp` values exactly the same that you want to group on? If its by day that you want to group on, then cast the `aTimeStamp` to an 8 character varchar or a date value dropping the time. — crthompson, Oct 09 '14 at 17:29
They are not. The time stamps are actually what get added to the non-unique ids to make them unique, they can be seconds to minutes to hours apart, but its impossible to predict as its based on user input. Just left the column there to show I wanted more than just the ids. — charles, Oct 09 '14 at 17:31
The group by is working just fine, that's not the issue. I'm trying to figure out how to remove the line: `1 | 1235 | 2014-10-08...` given only that its maximum value for `row` is 1. — charles, Oct 09 '14 at 17:36

Bacon Bits · Accepted Answer · 2014-10-09T18:27:46.967

Based on syntax, I'm assuming this is SQL Server 2005 or higher. My answer will be meant for that.

You have a couple options.

One, use a CTE:

;WITH CTE AS (
    SELECT ROW_NUMBER() OVER
            (PARTITION BY nonUniqueId ORDER BY nonUniqueId, aTimeStamp) as row
        ,nonUniqueId
        ,aTimeStamp
    FROM myTable
)
SELECT *
FROM CTE t
WHERE EXISTS (SELECT 1 FROM CTE WHERE row = 2 and nonUniqueId = t.nonUniqueId);

Or, you can use subqueries:

SELECT ROW_NUMBER() OVER
        (PARTITION BY nonUniqueId ORDER BY nonUniqueId, aTimeStamp) as row
    ,nonUniqueId
    ,aTimeStamp
FROM myTable t
WHERE EXISTS (SELECT 1 FROM myTable 
        WHERE nonUniqueId = t.nonUniqueId GROUP BY nonUniqueId, aTimeStamp HAVING COUNT(*) >= 2);

I removed the Group By statement. Left it in there from before I added the row column, and didn't realize it did nothing as a result. That said, this is the answer; however, I'm using Netezza (left that in tags). — charles, Oct 09 '14 at 17:49

Select entire partition where max row in partition is greater than 1

1 Answers1