1

I've got the below code that works beautifully for comparing rows in a group when the first row doesnt matter.

data want_Find_Change;
    set WORK.IA;
    by ID;

    array var[*]  $      RATING;
    array lagvar[*]  $   zRATING;
    array changeflag[*] RATING_UPDATE;

    do i = 1 to dim(var);
        lagvar[i] = lag(var[i]);
    end;

    do i = 1 to dim(var) ;
        changeflag[i] = (var[i] NE lagvar[i] AND NOT first.ID);
    end;

   drop i;
run;

Unfortunately, when I use a dataset that has two rows per group I get incorrect returns, I'm assuming because the first row has to be used in the comparison. How can I compare the only to rows and a return only on the second row. This did not work:

data Change;
    set WORK.Two;
    by ID;
    changeflag = last.RATING NE first.RATING;
run;

Example of the data I have and want

 Group    Name     Sport     DogName   Eligibility 
   1      Tom      BBALL     Toto        Yes
   1      Tom      golf      spot        Yes
   2     Nancy     vllyball  Jimmy       yes
   2     Nancy     vllyball  rover       no

want

  Group    Name     Sport     DogName   Eligibility N_change  S_change  D_Change  E_change
     1      Tom      BBall     Toto        Yes      0           0        0       0
     1      Tom      golf      spot        Yes      0           1        1       0
     2     Nancy     vllyball  Jimmy       yes      0           0        0       0
     2     Nancy     vllyball  rover       no       0           0        1       1
user2448666
  • 329
  • 1
  • 6
  • 14

1 Answers1

0

If you want only the first row to not be flagged, you first need to create a variable enumerating the rows within each group. You can do so with:

data temp;
  set have;
  count + 1;
  by Group;
  if first.Group then count = 1;
run;

In a second step, you can run a proc sql with a subquery, count distinct by groups, and case when:

proc sql;
create table want as
select 
Group, Name, Sport, DogName, Eligibility, 
case when count_name > 1 and count > 1 then 1 else 0 end as N_change,
case when count_sport > 1 and count > 1 then 1 else 0 end as S_change,
case when count_dog > 1 and count > 1 then 1 else 0 end as D_change,
case when count_E > 1 and count > 1 then 1 else 0 end as E_change
from (select *,
      count(distinct(Name)) as count_name,
      count(distinct(Sport)) as count_sport, 
      count(distinct(DogName)) as count_dog, 
      count(distinct(Eligibility)) as count_E
      from temp
      group by Group);
quit;

Best,

LuizZ
  • 945
  • 2
  • 11
  • 23