2

First time post so hopefully someone can kindly assist on this problem I'm facing within SAS EG (still learning SAS coding so please be kind!)

If you see a snippet of the dataset below what I'm trying to do is tally up the scores (pts) by Ref based on consecutive occurrences that flag has showed for that Ref.

For Example: If you take Ref 505 for A_Flag there is 2 different sets of consecutive occurrences of that flag then scoring will be as follows:

  • 1st ID > 1st instance = 25 points
  • 2nd ID > 2nd instance but 1st consecutive instance = double to 50 points
  • 3rd ID > 0 instance = 0 points
  • 4th ID > 1st instance = 25 points
  • 5th ID > 2nd instance but 1st consecutive instance = double to 50 points
  • 6th ID > 0 instance = 0 points

Therefore for this Ref A_Pts will be 150 points.

Another example: If you take Ref 527 for B_Flag there is 4 consecutive occurrences of that flag so coring per ID:

  • 1st ID > 0 instance = 0 points
  • 2nd ID > 1st instance = 10 points
  • 3rd ID > 2nd instance but 1st consecutive instance = double to 20 points
  • 4th ID > 3rd instance but 2nd consecutive instance = double to 40 points
  • 5th ID > 4th instance but 3rd consecutive instance = double to 80 points

Therefore for this Ref B_Pts will be 150 points

I have to say the data is in the necessary order for what I'm trying to achieve.

I'd tried using LAG function but that will only work based on the 1st consecutive instance.

I also tried calculate a count - an enumeration variable based on cats(Ref,A_Flag) - but it then orders the data incorrectly and doesnt count up accordingly

Hopefully this makes sense to someone out there!

The dataset in question:

+-----------+-----+--------+--------+--------+-------+-------+
|   date    | Ref | FormID | A_Flag | B_Flag | A_Pts | B_Pts |
+-----------+-----+--------+--------+--------+-------+-------+
| 01-Feb-17 | 505 |  74549 | A      |        |    25 |     0 |
| 01-Feb-17 | 505 |  74550 | A      |        |    25 |     0 |
| 10-Jan-17 | 505 |  82900 |        | B      |     0 |    10 |
| 13-Jan-17 | 505 |  82906 | A      |        |    25 |     0 |
| 09-Jan-17 | 505 |  82907 | A      |        |    25 |     0 |
| 11-Jan-17 | 505 |  82909 |        | B      |     0 |    10 |
| 03-Jan-17 | 527 |  62549 | A      |        |    25 |     0 |
| 04-Jan-17 | 527 |  62550 |        | B      |     0 |    10 |
| 04-Jan-17 | 527 |  76151 |        | B      |     0 |    10 |
| 04-Jan-17 | 527 |  76152 | A      | B      |    25 |    10 |
| 04-Jan-17 | 527 |  76153 | A      | B      |    25 |    10 |
+-----------+-----+--------+--------+--------+-------+-------+

Desired output (unless there is a better suggestion):

+-----------+-----+--------+--------+--------+-----------+-----------+
|   date    | Ref | FormID | A_Flag | B_Flag | A_Pts_Agg | B_Pts_Agg |
+-----------+-----+--------+--------+--------+-----------+-----------+
| 01-Feb-17 | 505 |  74549 | A      |        |        25 |         0 |
| 01-Feb-17 | 505 |  74550 | A      |        |        50 |         0 |
| 10-Jan-17 | 505 |  82900 |        | B      |         0 |        10 |
| 13-Jan-17 | 505 |  82906 | A      |        |        25 |         0 |
| 09-Jan-17 | 505 |  82907 | A      |        |        50 |         0 |
| 11-Jan-17 | 505 |  82909 |        | B      |         0 |        10 |
| 03-Jan-17 | 527 |  62549 | A      |        |        25 |         0 |
| 04-Jan-17 | 527 |  62550 |        | B      |         0 |        10 |
| 04-Jan-17 | 527 |  76151 |        | B      |         0 |        20 |
| 04-Jan-17 | 527 |  76152 | A      | B      |        25 |        40 |
| 04-Jan-17 | 527 |  76153 | A      | B      |        50 |        80 |
+-----------+-----+--------+--------+--------+-----------+-----------+

So when totalled up it'll be

+-----+-----------+-----------+
| Ref | A_Pts_Agg | B_Pts_Agg |
+-----+-----------+-----------+
| 505 |       150 |        20 |
| 527 |       100 |       150 |
+-----+-----------+-----------+
Rockshah
  • 77
  • 2
  • 11

1 Answers1

0

Try this:

data have;
infile cards dlm='|';
input date :date7. Ref :8. FormID :8. A_Flag :$1. B_Flag :$1. A_Pts :8.  B_Pts :8.;
format date date7.;
cards;
| 01-Feb-17 | 505 |  74549 | A      |        |    25 |     0 |
| 01-Feb-17 | 505 |  74550 | A      |        |    25 |     0 |
| 10-Jan-17 | 505 |  82900 |        | B      |     0 |    10 |
| 13-Jan-17 | 505 |  82906 | A      |        |    25 |     0 |
| 09-Jan-17 | 505 |  82907 | A      |        |    25 |     0 |
| 11-Jan-17 | 505 |  82909 |        | B      |     0 |    10 |
| 03-Jan-17 | 527 |  62549 | A      |        |    25 |     0 |
| 04-Jan-17 | 527 |  62550 |        | B      |     0 |    10 |
| 04-Jan-17 | 527 |  76151 |        | B      |     0 |    10 |
| 04-Jan-17 | 527 |  76152 | A      | B      |    25 |    10 |
| 04-Jan-17 | 527 |  76153 | A      | B      |    25 |    10 |
;
run;

data want;
  set have;
  by Ref;
  retain A_pts_agg B_pts_agg;
  if first.Ref then do;
    A_pts_agg = A_pts;
    B_pts_agg = B_pts;
  end;
  if lag(A_flag) ne (A_flag) then A_pts_agg = A_pts;
  else if A_flag = 'A' then A_pts_agg = A_pts_agg * 2;
  if lag(B_flag) ne (B_flag) then B_pts_agg = B_pts;
  else if B_flag = 'B' then B_pts_agg = B_pts_agg * 2;
run;
user667489
  • 9,501
  • 2
  • 24
  • 35