I have a dataset with 5 groups and I want to use the DS2 procedure in SAS to concurrently compute group means.
Simulated dataset:
data sim;
call streaminit(7);
do group = 1 to 5;
do pt = 1 to 500;
x = rand('ERLANG', group);
output;
end;
end;
run;
How I envision it working is that each of 5 threads receives a subset of the data corresponding to a particular group. The mean of x
is calculated on each subset like so:
proc ds2;
thread t / overwrite=yes;
dcl double n sum mean;
method init();
n = 0;
sum = 0;
mean = .;
end;
method run();
set sim; /* Or perhaps a subsetted dataset */
sum + x;
n + 1;
end;
method term();
mean = sum / n;
output;
end;
endthread;
...
quit;
The problem is, if you call a thread that processes a dataset like below, rows are sent to the 5 threads all willy-nilly (i.e. irrespective of groups).
data test / overwrite=yes;
dcl thread t t_instance;
method run();
set from t_instance threads=5;
end;
enddata;
How can I tell SAS to subset the data by group
and pass each subset to its own thread?