3

I am trying my hand at SAS macros for the first time. My basic question is: I have a dataset with some 10000 variables. I need to take each column individually, create a new conditional variable, store the results, then move to the next column. These columns are not sequentially ordered. What is the syntax SAS uses to identify a column (similar to how "_N_" would identify a row)?

Here is more information. The data looks something like this:

ID  v1   v2   v3  ... v10000
01  3.2  1.5  7.8 ...   4.2
02  1.1  4.5  1.9 ...  10.7
..
 N  2.5  1.5  4.9 ...   7.3

I need to look at the values of v1, count how many obs are above a value x and how many are below a value x, record those numbers in a dataset, then move onto v2, v3, ... v10000. In the end, I'd have a dataset that would show the number of obs above value x and number of obs below value x for each of my 10000 variables.

I have written the code as I would have written it for one variable in standard SAS code and it works, and now my intention is to convert that code into macro code, but I do not know how to construct a loop that would move from one column to the next.

Any help or references you could give would be greatly appreciated.

Thanks.

Samantha
  • 45
  • 2
  • 6
  • Ok, more diligent research has led me to use arrays first, then proc means. Macro not even necessary. – Samantha Jul 08 '11 at 01:57

2 Answers2

3
%LET CUTOFF = 3.1415926 ; /* set this as your 'cutoff' value */

data counters ;
  set mydata end=eof ;

  array vi{*} v1-v10000 ; /* incoming values */
  array vc{*} c1-c10000 ; /* counters */

  retain vc . ;

  do i = 1 to dim(vi) ;
    if vi{i} >= &CUTOFF then vc{i} + 1 ;
  end ;

  if eof then output ;

  keep c1-c10000 ;
run ;
Chris J
  • 7,549
  • 2
  • 25
  • 25
  • Thanks! That's pretty. I used "do over" and set a binary variable, which I then totaled. Yours is much more efficient. – Samantha Jul 11 '11 at 19:46
  • @Samantha: I realize this is about 4 years late, but if I'm not mistaken, `do over` is deprecated in favor of an approach like this one. Just FYI. – Alex A. Feb 10 '15 at 15:14
  • @Alex: Thanks! I appreciate the extra info. – Samantha Feb 11 '15 at 16:37
0

Not going to be the most efficient way but this will get you 10000 individual data sets.

%macro splitdata;
%do i=1 %to 10000;
data v_&i;
   set v;
   array vArray[10000] v1-v10000;
   keep vArray[&i]; 
run;
%end splitdata;

%splitdata;

From there you could employ the same sort of macro do loop on each data set v_1, v_2,....

DataParadigms
  • 437
  • 2
  • 11
  • I must not have been clear in my question -- I wanted one dataset, not many. But you are right in that arrays are the answer. I just did not know the terminology! Now I do. :-) Thanks very much for the reply. – Samantha Jul 08 '11 at 17:26