-2

I have table in SAS Enterprise Guide like below.

Data types and meaning:

  • ID - numeric - ID of client
  • DT - date - date of change
  • OFFER_1 - charcter - current offer
  • OFFER_2 - character - offer after change

Values in original dataset are not sorted, but it can be sorted if it is important for solution.

ID   | DT        | OFFER_1  | OFFER_2
-----|-----------|----------|----------
123  | 01MAY2020 | PR       | PR
123  | 05MAY2020 | PR       | P
123  | 10MAY2020 | P        | P
123  | 11MAY2020 | P        | P
123  | 20MAY2020 | P        | PR
123  | 21MAY2020 | PR       | M
123  | 25MAY2020 | M        | M
777  | 30MAY2020 | PR       | M
223  | 02JAN2020 | PR       | PR
223  | 15MAR2020 | PR       | PR
402  | 20MAR2020 | M        | M
33   | 11AUG2020 | M        | PR
11   | 20JAN2020 | PR       | M
11   | 05FEB2020 | M        | M

And I need to create new column "COL1" where will be information:

  • If client change offer from PR to P or M then count how many maximum days he lasted before returning to PR again, or how many days he is already on P or M after the change, if he has not returned to PR
  • If he did not change PR to P or M "COL1" = 0

So, as a result I need something like below:

ID   | DT        | OFFER_1  | OFFER_2  | COL1
-----|-----------|----------|----------|---------
123  | 01MAY2020 | PR       | PR       | 15
123  | 05MAY2020 | PR       | P        | 15
123  | 10MAY2020 | P        | P        | 15
123  | 11MAY2020 | P        | P        | 15
123  | 20MAY2020 | P        | PR       | 15
123  | 21MAY2020 | PR       | M        | 15
123  | 25MAY2020 | M        | M        | 15
777  | 30MAY2020 | PR       | M        | 1
223  | 02JAN2020 | PR       | PR       | 0
223  | 15MAR2020 | PR       | PR       | 0
402  | 20MAR2020 | M        | M        | 0
33   | 11AUG2020 | M        | PR       | 0
11   | 20JAN2020 | PR       | M        | 16
11   | 05FEB2020 | M        | M        | 16

Because:

  • ID = 123, has 15 in "COL1" - because changed PR to P or M and lasted a maximum of 15 days (from 05MAY2020 till 20MAY2020) before return to PR again, changed PR to P or M 2 times but the 2nd time the change lasted 4 days and 15 > 4
  • ID = 777, has 1 in "COL1" - because changed PR to P or M and lasted a maximum of 1 day (30MAY2020)
  • ID = 223, has 0 in "COL1" - because he did not change PR to P or M
  • ID = 402, has 0 in "COL1" - because he did not change PR to P or M
  • ID = 33, has 0 in "COL1" - because he did not change PR to P or M
  • ID = 11, has 16 in "COL1" - because changed PR to P or M and lasted a maximum of 16 days (from 20JAN2020 till 05FEB2020) and did not return to PR again

How can I do that in SAS Enterprise Guide in PROC SQL or in normal SAS code ?

unbik
  • 178
  • 9
  • Again, don't duplicate questions, update original one. – Negdo Sep 22 '22 at 09:42
  • Oh, I'm sorry, I just didn't explain the problem accurately so the solution didn't solve the problem. Do you have any idea Negdo? – unbik Sep 22 '22 at 09:43
  • If you don't explain the problem accurately enough the first time just update the post. That is the point of edit function. – Negdo Sep 22 '22 at 09:46
  • ok I will keep it in mind, Negdo :), in the meantime do you have any idea how to approach the described problem? – unbik Sep 22 '22 at 10:01
  • does anyone has any idea ? – unbik Sep 22 '22 at 14:16
  • Does this answer your question? [How to check max difference in dates between events and create new column in SAS Enterprise Guide?](https://stackoverflow.com/questions/73806369/how-to-check-max-difference-in-dates-between-events-and-create-new-column-in-sas) – Stu Sztukowski Sep 22 '22 at 14:28
  • Stu Sztukowski, almost right but not quite. because using your modified code when I have offer_1 = P or M and offer_2 = P or M I should have 0 or NaN, but now you count difference between the first and last DT of such an ID. We are only interested in the maximum number of days the customer has endured after changing from PR to P or M if he has made several such changes, or simply the number of days how many he is already on such a change. If he changed P to P or M to M or PR to PR count NOTHING for such change :) Could you modify your answer, please? – unbik Sep 22 '22 at 15:11
  • Stu Sztukowski, as you can see for ID = 123, he made 2 changes 05MAY2020 and 21MAY2020, duration of the first change was 15 days (the he back to P) and duration of the second change was 4 days, so max is 15 :) do you understand ? :) – unbik Sep 22 '22 at 15:14

1 Answers1

0

I'll try to explain this steps by step.

First of all, we need to fetch all changes of offer, so I'd recommend filtering all rows, where OFFER_1 <> OFFER_2

so the resulting table (t2) would look like this (assuming we're only checking our logic on ID = 123

ID   | DT        | OFFER_1  | OFFER_2  | COL1
-----|-----------|----------|----------|---------
123  | 05MAY2020 | PR       | P        | 15
123  | 20MAY2020 | P        | PR       | 15
123  | 21MAY2020 | PR       | M        | 15

Then, I'd recommend joining our new table, t2, to the old one, t1, with joining order:

on t1.ID = t2.ID and t1.DT < t2.DT

calculate difference = t2.DT - t1.DT and leave max(difference)

Does that make sense to you?

Niqua
  • 386
  • 2
  • 15
  • Are you using `<>` to represent NOT EQUAL, like in SQL code. Or MAX like in SAS code? – Tom Sep 27 '22 at 12:40