Compare rows within group of size two

Question

I've got the below code that works beautifully for comparing rows in a group when the first row doesnt matter.

data want_Find_Change;
    set WORK.IA;
    by ID;

    array var[*]  $      RATING;
    array lagvar[*]  $   zRATING;
    array changeflag[*] RATING_UPDATE;

    do i = 1 to dim(var);
        lagvar[i] = lag(var[i]);
    end;

    do i = 1 to dim(var) ;
        changeflag[i] = (var[i] NE lagvar[i] AND NOT first.ID);
    end;

   drop i;
run;

Unfortunately, when I use a dataset that has two rows per group I get incorrect returns, I'm assuming because the first row has to be used in the comparison. How can I compare the only to rows and a return only on the second row. This did not work:

data Change;
    set WORK.Two;
    by ID;
    changeflag = last.RATING NE first.RATING;
run;

Example of the data I have and want

 Group    Name     Sport     DogName   Eligibility 
   1      Tom      BBALL     Toto        Yes
   1      Tom      golf      spot        Yes
   2     Nancy     vllyball  Jimmy       yes
   2     Nancy     vllyball  rover       no

want

  Group    Name     Sport     DogName   Eligibility N_change  S_change  D_Change  E_change
     1      Tom      BBall     Toto        Yes      0           0        0       0
     1      Tom      golf      spot        Yes      0           1        1       0
     2     Nancy     vllyball  Jimmy       yes      0           0        0       0
     2     Nancy     vllyball  rover       no       0           0        1       1

Please illustrate your question with data instead of just code. — Parfait, Aug 06 '20 at 19:14
Your code is a data step. I don't see that this has anything to do with SQL, so I removed the tag. — Gordon Linoff, Aug 06 '20 at 19:15
Using the code provided and adapting it to your `want` dataset, I am able to get the desired results. — Stu Sztukowski, Aug 06 '20 at 20:11

score 0 · Answer 1 · answered Oct 19 '20 at 19:41

If you want only the first row to not be flagged, you first need to create a variable enumerating the rows within each group. You can do so with:

data temp;
  set have;
  count + 1;
  by Group;
  if first.Group then count = 1;
run;

In a second step, you can run a proc sql with a subquery, count distinct by groups, and case when:

proc sql;
create table want as
select 
Group, Name, Sport, DogName, Eligibility, 
case when count_name > 1 and count > 1 then 1 else 0 end as N_change,
case when count_sport > 1 and count > 1 then 1 else 0 end as S_change,
case when count_dog > 1 and count > 1 then 1 else 0 end as D_change,
case when count_E > 1 and count > 1 then 1 else 0 end as E_change
from (select *,
      count(distinct(Name)) as count_name,
      count(distinct(Sport)) as count_sport, 
      count(distinct(DogName)) as count_dog, 
      count(distinct(Eligibility)) as count_E
      from temp
      group by Group);
quit;

Best,

Compare rows within group of size two

1 Answers1