2

Trying to create a table that displays the most frequent ndc# for a given patient(pat_seqno). My having clause have not worked thus far and this is my most recent attempt, which returns an error that ndc_count is not found in the contributing tables. Thanks for the help

proc sql;
create table unique_rx_count as
select pat_seqno , ndc_seqno, ndc_count
from d1
where ndc_seqno in 
    (select count(ndc_seqno) as ndc_count
    from d1
    group by pat_seqno)
group by pat_seqno
having ndc_count =  max(ndc_count)
   ;
quit;

example: Pat_seqno ndc_seqno 2 45 2 45 2 23 2 45 16 10 16 10 16 78

return expected Pat_seqno ndc_seqno 2 45 16 10

user2448666
  • 329
  • 1
  • 6
  • 14

3 Answers3

1

This is a quick and dirty solution, but it work. I first break it down into three seperate sql statements:

*count per patient/ndc;
proc sql; 
 create table step1 as 
 select pat_seqno, ndc_seqno, count(*) as ndc_count
 from d1
 group by pat_seqno, ndc_seqno
 ;
quit;

* maxcount per patient;
proc sql;      
 create table step2 as
 select pat_seqno, max(ndc_count) as ndc_count
 from step1
 group by pat_seqno
 ;
quit;

*join count and maxcount;
proc sql;      
 create table want as
 select t1.*
 from step1 t1
 inner join step2 t2
 on t1.pat_seqno = t2.pat_seqno
 and t1.ndc_count = t2.ndc_count
 ;
quit;

and if you want you can combine it into a single SQL statement

proc sql;
 create table want as
 select t1.*
 from 
 (
  select pat_seqno, ndc_seqno, count(*) as ndc_count
  from d1
  group by pat_seqno, ndc_seqno
 ) t1
 inner join (
  select pat_seqno, max(ndc_count) as ndc_count
  from  (
   select pat_seqno, ndc_seqno, count(*) as ndc_count
   from d1
   group by pat_seqno, ndc_seqno
  )
 group by pat_seqno
 ) t2
 on t1.pat_seqno = t2.pat_seqno
 and t1.ndc_count = t2.ndc_count
 ;
 quit;
Laurent de Walick
  • 2,154
  • 14
  • 11
1
proc sql;
create table unique_rx_count as

select a.pat_seqno, a.ndc_seqno
from d1 a
group by a.pat_seqno, a.ndc_seqno
having count(*)=
        (select max(count)
        from
           (select c.pat_seqno, c.ndc_seqno, count(c.ndc_seqno) as count
           from d1 c
           group by c.pat_seqno, c.ndc_seqno)
        where c.pat_seqno=a.pat_seqno);
quit;
Dmitry Shopin
  • 1,753
  • 10
  • 11
0

Maybe something like this:

SELECT a.pat_seqno, a.ndc_seqno, MAX(b.ndc_count)
FROM d1 a
INNER JOIN (
  SELECT pat_seqno, ndc_seqno, COUNT(*) AS ndc_count
  FROM d1
  GROUP BY pat_seqno, ndc_seqno
  ) b ON b.pat_seqno = a.pat_seqno AND a.ndc_seqno = b.ndc_seqno
GROUP BY a.pat_seqno, a.ndc_seqno
Filipe Silva
  • 21,189
  • 5
  • 53
  • 68