I have some datasets that I clean with code based on methods from the biological literature, and then I want to split each one into day and night (because they must be analyzed separately). This worked, but now I need to do it for the full set, which is way too many files to handle one by one. So I am now trying to write a macro to split them into days and nights for me.
My data looks like this:
Hour var1 var2 var3
1 123 90 100
2 122 99 108
...........
4 156 80 120
4 156 80 145
4 143 82 132
Basically, night has one observation per hour and day has three. I also have this for many days.
Each dataset is named STUDYID_ID#_first or STUDYID_ID#_last. I want to generate four datasets per input dataset. For example, MYID111_first would create: MYID111_first_day_var1, MYID111_first_day_var2, MYID111_first_night_var1, and MYID111_first_night_var2.
I would then LIKE to append them into 4 datasets: MYID_A_first_day_var1, MYID_A_first_day_var2, MYID_A_first_night_var1 , and MYID_A_first_night_var2.
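To make the naming scheme concrete, here is a minimal sketch that only writes the four intended per-dataset names to the log. The macro name show_names is hypothetical and used for illustration only; it is not part of my real code and it does not create any datasets.

```sas
/* Hypothetical helper: print the four output names intended for one input dataset. */
%macro show_names(ds);
  %local t v;
  %do t = 1 %to 2;                          /* 1 = day, 2 = night */
    %do v = 1 %to 2;                        /* var1, var2 */
      /* The period after &ds ends the macro variable name, */
      /* so the underscore that follows is kept as plain text. */
      %put &ds._%scan(day night, &t)_var&v;
    %end;
  %end;
%mend show_names;

%show_names(MYID111_first)
```

For MYID111_first this should write MYID111_first_day_var1, MYID111_first_day_var2, MYID111_first_night_var1 and MYID111_first_night_var2 to the log, one per line.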
MY CODE SO FAR:
%macro datacut(libname,worklib=work, grp = _A ,time1 = _night , time2 = _day , type1 = _var1 , type2 = _var2);
%local num i;
proc datasets library=&libname memtype=data nodetails;
contents out=&worklib..temp1(keep=memname) data=_all_ noprint;
run;
data _null_;
set &worklib..temp1 end=final;
by memname notsorted;
if last.memname;
n+1;
call symput('ds'||left(put(n,8.)),trim(memname));
if final then call symput('num',put(n,8.));
run;
%do i=1 %to &num;
/* do the artifact removing method */
DATA &libname..&&ds&i;
SET &libname..&&ds&i;
PT_ID = '&ds&i' ;
IF var1< 60 OR var1> 230 then delete;
IF var2< 30 OR var2> 230 THEN delete;
IF var3< 60 OR var3 > 135 THEN DELETE;
IF var2 > var1 then delete;
run;
/* get just the night values */
PROC SQL;
CREATE TABLE &libname..&&ds&i&time1 as
SELECT *
FROM &libname..&&ds&i
WHERE Hour BETWEEN 0 and 6 OR Hour BETWEEN 22 and 24
order by systolic
;
QUIT;
/* trim off the proper number of observations for variable 1 */
DATA &libname..&&ds&i&time1&type1;
SET &libname..&&ds&i&time1 end=eof;
IF _N_ =1 then delete;
if eof then delete;
run;
PROC append base= &libname..&&ds&time1&type1
data= &libname..&&ds&i&time1;
run;
QUIT;
%end;
%mend datacut;
%datacut(work)
The initial data step works correctly, but the later steps don't name the datasets as planned. I get a bunch of datasets called Ds10_night_var1 with the wrong field names (memtype, nodetails, data).
I get the warning:
WARNING: Apparent symbolic reference DS1_NIGHT not resolved.
NOTE: Line generated by the macro variable "TIME1".
1 work.&ds1_night
-
22
200
ERROR 22-322: Expecting a name.
ERROR 200-322: The symbol is not recognized and will be ignored.
NOTE: The SAS System stopped processing this step because of errors.
NOTE: PROCEDURE SQL used (Total process time):
real time 0.00 seconds
cpu time 0.00 seconds
WARNING: Apparent symbolic reference DS1_NIGHT_SYS not resolved.
22: LINE and COLUMN cannot be determined.
NOTE 242-205: NOSPOOL is on. Rerunning with OPTION SPOOL might allow recovery of the LINE and
COLUMN where the error has occurred.
ERROR 22-322: Syntax error, expecting one of the following: a name, a quoted string, /, ;,
_DATA_, _LAST_, _NULL_.
201: LINE and COLUMN cannot be determined.
NOTE: NOSPOOL is on. Rerunning with OPTION SPOOL might allow recovery of the LINE and COLUMN
where the error has occurred.
ERROR 201-322: The option is not recognized and will be ignored.
So I want my datasets to have the right names and to actually contain data, and I don't understand why they don't.
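The first warning can be reproduced outside the macro with a minimal sketch. The values assigned to ds1, time1 and i below are stand-ins chosen to mirror what the macro would generate; they are not taken from the real run.

```sas
/* Stand-in values (assumptions for illustration only). */
%let ds1   = MYID111_first;
%let time1 = _night;
%let i     = 1;

/* Same pattern as in the macro. The first scan turns this into &ds1_night, */
/* so the log shows: Apparent symbolic reference DS1_NIGHT not resolved.    */
%put &&ds&i&time1;

/* With two periods, the first scan leaves &ds1._night, where the remaining */
/* period ends the name ds1. This resolves to MYID111_first_night.          */
%put &&ds&i..&time1;
```

In this sketch the two-period form resolves to the name I intended, which suggests the suffix is being read as part of the macro variable name. Whether that also explains the datasets with the wrong columns is not something this reproduction establishes.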