I got stuck in one task.
Now I have a variable called placement which has two categories, out of home and at home (A and B for short here). A case can be placed at AAABAB sequentially. Each placement has a starting date and ending date, for instance, 01/20/2015-2/21/2015 for first A, 2/21/2015-8/05/2015 for second A....etc.
Now I structured data like this, 111.1. (A as 1 and B as missing). How can I combine out-of-home placement time? That is, use the first A's starting date as starting date, use the third A's ending date as ending date. This will be the first period of time that the case is out-of-home. I am thinking about using a do loop to do this. But cannot figure out how.
I think, another way to do this, is to identify out-of-home groups, for example, AAABAABAAAA can be labeled as 11123345555. Then I can use PROC SQL, min and max function, to find starting date and ending date.
proc sql; create table comb as select *,min(strtdt) as minstrtdt, max(livarenddt) as maxenddt from limt group by caseid,groupid; quit;
Original data, like this, (out indicates out of home, 1 means yes, . means no)
Obs caseid livarstrtdt livarenddt placemnt out
81 00040903 14SEP2010 09DEC2010 01 .
82 00040903 09DEC2010 28FEB2011 02 1
83 00040903 28FEB2011 01APR2011 02 1
84 00040903 01APR2011 01JUL2011 02 1
85 00040903 01JUL2011 08AUG2012 02 1
86 00040903 08AUG2012 05NOV2014 02 1
87 00040903 05NOV2014 05NOV2014 03 .
88 00040903 12AUG2008 13AUG2008 12 1
89 00040903 13AUG2008 13AUG2008 01 .
90 00040903 13AUG2008 21AUG2008 01 .
I want the output like this,
Obs caseid livarstrtdt livarenddt placemnt out
81 00040903 14SEP2010 09DEC2010 01 .
82 00040903 09DEC2010 05NOV2014 02 1
87 00040903 05NOV2014 05NOV2014 03 .
88 00040903 12AUG2008 13AUG2008 12 1
89 00040903 13AUG2008 13AUG2008 01 .
90 00040903 13AUG2008 21AUG2008 01 .
Anyone have ideas? Thanks!!
UPDATE,
Now I modified my data, caseid livaropnseq livarstrtdt livarenddt placemnt out
183 00040903 01 14SEP2010 09DEC2010 01 .
184 00040903 01 09DEC2010 28FEB2011 02 1
185 00040903 01 28FEB2011 01APR2011 02 1
186 00040903 01 01APR2011 01JUL2011 02 1
187 00040903 01 01JUL2011 08AUG2012 02 1
188 00040903 01 08AUG2012 05NOV2014 02 1
189 00040903 01 05NOV2014 05NOV2014 03 .
190 00040903 02 12AUG2008 13AUG2008 12 .
191 00040903 02 13AUG2008 13AUG2008 02 1
192 00040903 02 13AUG2008 21AUG2008 02 1
As you can see, rows of different livaropnseq should not be combined together. For example Row 189 and 190, even though the case was at home at both time period, they belong to different livaropnseq (different open sequences). How can I incorporate this variable into the code in order to avoid collapsing them together? Greatly appreciated!!!
The output should be like this, Obs caseid livaropnseq livarstrtdt livarenddt placemnt out
183 00040903 01 14SEP2010 09DEC2010 01 .
184 00040903 01 09DEC2010 05NOV2014 02 1
189 00040903 01 05NOV2014 05NOV2014 03 .
190 00040903 02 12AUG2008 13AUG2008 12 .
191 00040903 02 13AUG2008 21AUG2008 02 1
UPDATE AGAIN, i USED THE following code to get what I want FINALLY!
data spell6;
set spell4;
by caseid livaropnseq out notsorted ;
group + first.out;
if first.caseid then group=1;
if first.out then startdt=livarstrtdt ;
retain startdt;
if last.out;
enddt = livarenddt ;
format startdt enddt date9.;
*drop livarstrtdt livarenddt ;
run;
Thank you so much, EVERYONE!! I really appreciated it!