0

I have a query which will perform joins over 6 tables and fetches various columns based on a condition. I want to add an extra filter condition which will give me only those members who have a count(distinct dateCaptured)>30. I'm able to get the list of members who satisfy this condition using Group by and having. But I don't want to group by other column names because of this one condition. Do I need to use PARTITION BY in this case.

Sample TABLE a

+-----+------------+--------------+
| Id  | Identifier | DateCaptured |
+-----+------------+--------------+
| 1   |      05548 | 2017-09-01   |
| 2   |      05548 | 2017-09-01   |
| 3   |      05548 | 2017-09-01   |
| 4   |      05548 | 2017-09-02   |
| 5   |      05548 | 2017-09-03   |
| 6   |      05548 | 2017-09-04   |
| 7   |      37348 | 2017-08-15   |
| 8   |      37348 | 2017-08-15   |
| .   |            |              |
| .   |            |              |
| .   |            |              |
| 54  |      37348 | 2017-10-15   |
+-----+------------+--------------+

Query

SELECT  a.value,
        b.value, c.value,
        d.value
        FROM Table a
    INNER JOIN Table b on a.Id=b.id
    INNER JOIN Table c on a.Id=c.Id and s.Invalid=0
    INNER JOIN Table d on a.Id=d.Id 

Assume Table a has more than 30 records for Identifier 37348. How can I get only this Identifier for the above query.

These are the patients i'm interested in for the above SELECT.

SELECT  a.Identifier,count(DISTINCT DateCaptured)
    FROM Table a
    INNER JOIN Table b on a.Id=b.id
    INNER JOIN Table c on a.Id=c.Id and s.Invalid=0
    INNER JOIN Table d on a.Id=d.Id 
    GROUP BY Identifier
    HAVING count(DISTINCT DateCaptured)>30
shockwave
  • 3,074
  • 9
  • 35
  • 60
  • ....it depends on what you want your output to look like, and what conditions you start with. You could certainly just aggregate over whatever table has `DateCaptured`, but it's not clear if you need to also reduce counted rows first (ie, must also have a reference in some other table). A windowed count is primarily useful if, in addition to counting, you're also displaying every single date. – Clockwork-Muse Oct 24 '17 at 20:42
  • What version of SQL server are you using? – Sparrow Oct 24 '17 at 20:44
  • @Clockwork-Muse The same output as the first query. count( distinct dateCaptured)>30 is just a filter condition which will consider only those `Identifier` who have more than 30 distinct dates – shockwave Oct 24 '17 at 20:50
  • @Sparrow 2008 R2 – shockwave Oct 24 '17 at 20:51

3 Answers3

1
WITH cte as (
    SELECT  a.Identifier
    FROM Table a
    INNER JOIN Table b on a.Id=b.id
    INNER JOIN Table c on a.Id=c.Id and s.Invalid=0
    INNER JOIN Table d on a.Id=d.Id 
    GROUP BY Identifier
    HAVING count(DISTINCT DateCaptured) > 30
)
SELECT  a.value,
        b.value, c.value,
        d.value
FROM Table a
INNER JOIN Table b on a.Id=b.id
INNER JOIN Table c on a.Id=c.Id and s.Invalid=0
INNER JOIN Table d on a.Id=d.Id 
INNER JOIN cte on cte.Identifier = a.Identifier
Juan Carlos Oropeza
  • 47,252
  • 12
  • 78
  • 118
1
SELECT  a.value,
        b.value, c.value,
        d.value
        FROM Table a
    INNER JOIN Table b on a.Id=b.id
    INNER JOIN Table c on a.Id=c.Id and s.Invalid=0
    INNER JOIN Table d on a.Id=d.Id 
WHERE a.Identifier IN (SELECT  a1.Identifier  
    FROM Table a1
    GROUP BY a1.Identifier HAVING count(DISTINCT a1.DateCaptured)>30)
Abdullah Dibas
  • 1,499
  • 1
  • 9
  • 13
1

If the multiple rows really are in tableA, then you can do:

SELECT a.value, b.value, c.value, d.value
FROM (SELECT a.*, COUNT(*) OVER (PARTITION BY id) as cnt
      FROM a
     ) a INNER JOIN
     b 
     ON a.Id = b.id INNER JOIN 
     c 
     ON a.Id = c.Id AND s.Invalid = 0 INNER JOIN 
     d 
     ON a.Id = d.Id
WHERE a.cnt > 30;

Note: If you still need count(distinct) you can do:

SELECT a.value, b.value, c.value, d.value
FROM (SELECT a.*, SUM(CASE WHEN seqnum = 1 THEN 1 ELSE 0 END) OVER (PARTITION BY id) as cnt
      FROM (SELECT a.*, ROW_NUMBER() OVER (PARTITION BY id ORDER BY DateCaptured) as seqnum
            FROM a
           ) a
     ) a INNER JOIN
     b 
     ON a.Id = b.id INNER JOIN 
     c 
     ON a.Id = c.Id AND s.Invalid = 0 INNER JOIN 
     d 
     ON a.Id = d.Id
WHERE a.cnt > 30;
Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786