I am looking for some validation. This may be trivial for most, but I am by no means an expert at statistics. I am trying to select the patients in the top 1% by score within each drug and location. The data look something like this (on a much larger scale):

Patient    drug    place    score
John         a      TX        12
Steven       a      TX        10 
Jim          B      TX        9
Sara         B      TX        4   
Tony         B      TX        2
Megan        a      OK        20
Tom          a      OK        10
Phil         B      OK        9 
Karen        B      OK        2 

The code snippet I have written to calculate those top 1% patients is as follows:

proc sql;
create table example as 
select *,
score/avg(score) as test_measure
from prior_table
group by drug, place
having test_measure>.99;
quit;

Does this achieve what I am trying to do, or am I going about it all wrong? Sorry if this is really trivial to most. Thanks.

bmb1020

1 Answer

There are multiple ways to calculate or estimate a percentile. A simple way is to use PROC SUMMARY:

proc summary data=have;
var score;
output out=pct p99=p99;
run;

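This will create a data set named pct with a variable p99 containing the 99th percentile of score. As written, the percentile is computed across all rows of have, not within each drug and place.

Then filter your table for values >= p99: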

proc sql noprint;
create table want as
select a.*
    from have as a
    where a.score >= (select p99 from pct);
quit;
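
Since the question asks for the top 1% within each drug and location, here is a minimal sketch of the same idea done per group. It assumes the input data set is named have with the columns drug, place and score shown in the question; the output names pct_by_group and want_by_group are made up for illustration. Note that the ratio score/avg(score) in the question compares each score to its group mean, which is not a percentile, so a cutoff of .99 on that ratio would not pick out the top 1%.

```sas
/* 99th percentile of score within each drug/place combination.
   NWAY keeps only the rows for the full drug*place crossing. */
proc summary data=have nway;
    class drug place;
    var score;
    output out=pct_by_group (drop=_type_ _freq_) p99=p99;
run;

/* Keep the patients whose score is at or above their own group's cutoff. */
proc sql;
    create table want_by_group as
    select a.*
    from have as a
         inner join pct_by_group as b
         on a.drug = b.drug and a.place = b.place
    where a.score >= b.p99;
quit;
```

With very small groups the 99th percentile will usually sit at or near the group maximum, so this is likely only meaningful when each drug and place combination has a reasonably large number of patients.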
DomPazz