I want to create a column "Cured" that takes value 1 when the customer left have1
either on the day or in the two following days after appearing in have2
. Data have1
is a dataset that defines when a client entered and left certain status. have2
shows when the client was contacted.
This is the code I made:
proc SQL;
create table want2 as
select b.*
, case when b.Outcome = "Answered" or
b.Outcome = "Answerphone Message Left" or
b.Outcome = "Answerphone Message Not Left" or
b.Outcome = "No Answer"
and
a.Start_date <= b.Date <= a.End_date
and
a.End_date <= b.Date+2
then 1 else 0 END as Cured
from have1 a, have2 b
where a.ID=b.ID;
quit;
The datasets I'm working with are:
data have1;
infile datalines dlmstr=' | ';
input ID Start_date :ddmmyy10. End_date :ddmmyy10.;
format date start_date date9.;
datalines;
ID | Start_date | End_date
1 | 01/01/2021 | 03/01/2021
1 | 20/01/2021 | 21/01/2021
2 | 05/01/2021 | 07/01/2021
3 | 10/01/2021 | 30/01/2021
3 | 25/01/2021 | 25/01/2021
;;;
run;
data have2;
infile datalines dlmstr=' | ';
input ID Date :ddmmyy10. Outcome ;
format Date date9.;
datalines;
ID | Date | Outcome
1 | 01/01/2021 | Answered
2 | 05/01/2021 | Asnwerphone Message Left
3 | 12/01/2021 | Answerphone Message Left
3 | 25/01/2021 | No Answer
;;;
run;
In this case I should get the following:
ID | Date | Outcome | Cured
1 | 01/01/2021 | Answered | 1
2 | 05/01/2021 | Asnwerphone Message Left | 1
3 | 12/01/2021 | Answerphone Message Left | 0
3 | 25/01/2021 | No Answer | 1
ID1 cured because he left have1
2 days after 01/01
ID2 also cured because they left have1
2 days after 05/01
ID3 didn't cure because they left have1
more than 15 days after 12/01
ID3 cured because they left have1
on the same day
This is the result with that code:
ID| Date | Outcome | Cured
1 01JAN2021 Answered 1
1 01JAN2021 Answered 0
2 05JAN2021 Asnwerph 0
3 12JAN2021 Answerph 0
3 25JAN2021 No Answe 0
3 12JAN2021 Answerph 0
3 25JAN2021 No Answe 1
Which is not right: first of all it duplicates observations, but it doesnt accurately selects those that should be 1 in Cured (ID 2).
Any idea what might be wrong/missing?