I am trying to analyze a dataset where each subject has 12 repeated measures (quarterly over 3 years). I want to extract subject-specific estimates of the time slope to evaluate whether each subject is changing significantly over time.

The code I currently have consistently suggests that each subject is demonstrating a highly significant increase over time. This seems unlikely but I'm not sure how to adjust my syntax to run a more accurate model. Does anyone know how/why this model would find the slope coefficient for time significant for all cases?

A quick description of the study: We are creating a trending report that should flag procedure codes (subjects) showing a significant increase in the number of times they were billed over the period being analyzed (3 years, by quarter). The outcome variable is being treated as a count (bounded at 0 but not necessarily a whole number).

%macro Zeroes(numzeroes);
   %local i;
   %do i = 1 %to %eval(&numzeroes-1);
      0
   %end;
   1;
%mend;

%macro EstimateStatement(numsubjects=);
   %local i;

   proc glimmix data=procdata11;
      class code;
      model billing_count=period_count / dist=NB link=log
            solution ddfm=betwithin;
      random intercept period_count / sub=code type=AR(1);
      random _residual_;
      %do i = 1 %to &numsubjects;
         estimate "Slope for Code &i" period_count 1 | period_count 1 / subject %Zeroes(&i);
      %end;
      ods output estimates=sscoeff;
   run;
%mend;

%EstimateStatement(numsubjects=&num_codes)

Any help on making this model more accurate and efficient would be greatly appreciated!

Thanks!

andrey_sz
aisley

1 Answer

Maybe the positive slope is an actual feature of the data? What do you see if you plot billing_count versus period_count for each code?
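
A minimal sketch of such a plot, assuming the dataset and variable names from the question (procdata11, code, period_count, billing_count). The output dataset name procdata11_sorted and the panel layout values are placeholders chosen for illustration:

```sas
/* Sort so that each code's series is drawn in time order.        */
/* procdata11_sorted is a hypothetical name, not from the question. */
proc sort data=procdata11 out=procdata11_sorted;
   by code period_count;
run;

/* One small panel per procedure code: billing_count over period_count. */
/* columns= and rows= are arbitrary layout choices.                     */
proc sgpanel data=procdata11_sorted;
   panelby code / columns=4 rows=3;
   series x=period_count y=billing_count;
run;
```

If most panels trend upward, the significant slopes may simply reflect the data rather than a problem with the model.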

Regarding the program, I have two suggestions.

(1) The use of type=AR(1) in

random intercept period_count / sub=code type=AR(1);

forces the variance of the intercepts to be equal to the variance of the slopes. This constraint may be inconsistent with the data. AR(1) is not a sensible covariance structure for a random coefficients model. Try type=UN or type=UN(1).

(2) Drop

random _residual_;

Its inclusion makes the model overspecified; the negative binomial distribution already has a scale parameter.
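
Putting both suggestions together, the PROC GLIMMIX step might look roughly like the sketch below. It reuses the names from the question and changes only the two RANDOM statements; the ESTIMATE statements generated by the macro in the question could stay inside the step unchanged.

```sas
proc glimmix data=procdata11;
   class code;
   model billing_count = period_count / dist=NB link=log
                                        solution ddfm=betwithin;

   /* type=UN gives the intercepts and slopes their own variances */
   /* plus a covariance, instead of forcing a common variance.    */
   random intercept period_count / sub=code type=UN;

   /* No "random _residual_;" statement here: dist=NB already */
   /* carries a scale parameter.                              */

   ods output estimates=sscoeff;
run;
```

If the unstructured fit has trouble converging, type=UN(1) drops the intercept-slope covariance and is a simpler alternative.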

Another thing to consider is that a random coefficients model produces shrinkage estimators: estimates for individual codes are shrunk toward the overall solution, so the slope estimates you obtain from the random coefficients model will not equal the estimates you would obtain from separate regressions for each code. Kreft et al. have a nicely intuitive presentation of this topic (see p. 14 here: http://tinyurl.com/ns99ojh).
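
To see the shrinkage for yourself, one option is to fit a separate regression per code and compare those slopes with the ones from the random coefficients model. This is only a sketch under the question's variable names; procdata11_sorted and separate_slopes are placeholder dataset names, and with just 12 observations per code some of the individual fits may fail to converge.

```sas
/* BY-group processing requires the data to be sorted by code. */
proc sort data=procdata11 out=procdata11_sorted;
   by code period_count;
run;

/* One negative binomial regression per code, with no pooling across codes. */
proc genmod data=procdata11_sorted;
   by code;
   model billing_count = period_count / dist=negbin link=log;
   /* Collect the per-code intercepts and slopes in one dataset. */
   ods output ParameterEstimates=separate_slopes;
run;
```

The period_count rows of separate_slopes can then be set side by side with the subject-specific estimates in sscoeff; the mixed-model slopes should generally sit closer to the overall slope.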

user20489