Sequences in SAS Tables

Question

I'm looking to add a sequence column to my sas dataset, but according to ids and transaction dates. To illustrate, below is the table I'm referring to:

ID   |   TXN_DT    |    
01   |  01JAN2020  |    
01   |  01JAN2020  |    
01   |  02JAN2020  |    
01   |  03JAN2020  |    
02   |  01JAN2020  |    
02   |  02JAN2020  |    
02   |  02JAN2020  |    
02   |  03JAN2020  |    
02   |  03JAN2020  |

and I want to add a sequence like so:

ID   |   TXN_DT    |   SEQ  | 
01   |  01JAN2020  |    1   |
01   |  01JAN2020  |    1   |
01   |  02JAN2020  |    2   |
01   |  03JAN2020  |    3   |
02   |  01JAN2020  |    1   |
02   |  02JAN2020  |    2   |
02   |  02JAN2020  |    2   |
02   |  03JAN2020  |    3   |
02   |  03JAN2020  |    3   |

I'm trying to run the following code, but it seems to jump a row up and not copying the previous' row's value, and instead skips to 2 rows above.

data want;
set have;
by id;
if first.id then seq=1;
else seq+1;
if txn_dt=lag(txn_dt) then seq = lag(seq);
keep id seq txn_dt;
run;

any help? Thanks in advance!

I would probably do it slightly differently and only increment `seq` when `txn_dt ^= lag(txn_dt)` — mjsqu, Feb 14 '20 at 01:58

score 1 · Answer 1 · answered Feb 14 '20 at 02:32

1

Try

if first.id then seq=0;
seq + (first.id or txn_dt ne lag(txn_dt);

answered Feb 14 '20 at 02:32

Richard

25,390
3
25
38

score 0 · Answer 2 · answered Feb 14 '20 at 08:54

Try to use retain and first.

data want(drop=txn_dt_group);
    set have;
    by id txn_dt;
    retain txn_dt_group seq;
    if first.id then do;
        txn_dt_group=txn_dt;
        seq=1;
    end;
    if txn_dt ne txn_dt_group then do;
        seq=seq+1;
        txn_dt_group=txn_dt;
    end;    
run;

Output:

+-----------+----+-----+
|  txn_dt   | ID | seq |
+-----------+----+-----+
| 01JAN2020 |  1 |   1 |
| 01JAN2020 |  1 |   1 |
| 02JAN2020 |  1 |   2 |
| 03JAN2020 |  1 |   3 |
| 01JAN2020 |  2 |   1 |
| 02JAN2020 |  2 |   2 |
| 02JAN2020 |  2 |   2 |
| 03JAN2020 |  2 |   3 |
| 03JAN2020 |  2 |   3 |
+-----------+----+-----+

score 0 · Answer 3 · answered Feb 14 '20 at 19:11

0

data want;
set have;
by id txn_dt;
if first.id then seq=1;
else if first.txn_dt then seq+1;
run;

I think that should do it.

answered Feb 14 '20 at 19:11

Jonathan Wilson

626
6
14

score 0 · Answer 4 · answered Feb 18 '20 at 11:26

For completeness, here is a hash solution that does not depend on the order of your data.

data have;
input ID $ TXN_DT :date9.;
infile datalines dlm='|';
format TXN_DT date9.;
datalines;
01|01JAN2020
01|01JAN2020
01|02JAN2020
01|03JAN2020
02|01JAN2020
02|02JAN2020
02|02JAN2020
02|03JAN2020
02|03JAN2020
;

data want(drop=rc);
   if _N_ = 1 then do;
      dcl hash h1 ();
      h1.definekey ('ID', 'TXN_DT');
      h1.definedata ('SEQ');
      h1.definedone ();
      dcl hash h2 ();
      h2.definekey ('ID');
      h2.definedata ('SEQ');
      h2.definedone ();

      do until (lr);
         set have end=lr;

         if h2.find() = 0 then do;
            if h1.check() ne 0 then seq + 1;
         end;
         else seq = 1;

         h1.ref();
         h2.replace();
      end;
   end;

   set have;
   rc = h1.find();
run;

Sequences in SAS Tables

4 Answers4