0

Given Dataset:

data hello;
    input id value;
    cards;
    101 22
    101 44
    103 22
    104 22
    104 55
    106 22
    106 .
;
run;

I am trying to create a nvariable and Dtype by Id variable as follows:

Id  value Nvalue Dtype
101 22      
101 44      
          33    Average
103 22      
104 22      
104 55      
          38.5    Average
106 22      
106 .       
          22    LOCF

Is there any possible way to get the output as mentioned above.

user3893852
  • 95
  • 1
  • 8
  • Those aren't averages. Those appear to be sums. You also made an error at ID 103. Furthermore, what is your desired output if an ID has more than one recorded value and more than one missing value? – udden2903 Jul 18 '16 at 14:59
  • You seem to forget about ID 103. It has no average in your desired output table. I guess the average of 38 for ID 104 should be 38.5 as 22 + 55 = 77? – udden2903 Jul 18 '16 at 15:12
  • First of all, Thanks for correcting my mistake. I am trying to derive an output when an ID has more than one recorded value and also trying to perform LOCF operation whenever an ID have missing value. – user3893852 Jul 18 '16 at 15:12

1 Answers1

1

Here is my attempt. I added a few more observations to the example to show you the result when missing values occur in a more unpredictable pattern.

data have;
    input id value;
    cards;
    101 22
    101 44
    103 22
    104 22
    104 55
    106 22
    106 .
    107 25
    107 .
    107 22
    108 .
    108 .
    109 10
    109 12
;
run;

proc sql;
    create table averages as
        select id, avg(value) as nvalue 
            from have
                group by id;
quit;

data want (drop=missing);
    set have averages;
    by id;
    retain missing;

    if first.id then
        missing=.;

    if not last.id and value=. then
        missing=1;
    length dtype $10;

    if last.id then
        do;
            if missing=1 then
                dtype="LOCF";
            else dtype="Average";
        end;
run;
udden2903
  • 783
  • 6
  • 15