
So I have a macro similar to this, with the objective of calculating information value:

%macro iv_calc(x, event, varlist);
/* Keep only the dependent variable and the listed independent variables */
data main_table;
    set &x. (keep=&event. &varlist.);
run;

/****Steps to compute IV ****/
%mend;

x is the name of the dataset, event is the name of the dependent variable, and varlist holds the names of all the independent variables as a macro variable. The number of variables in varlist is not known in advance and can range from 100 to 2000+, so the macro takes a very long time to run. I'm new to this, so I'd like to know whether there is a way to split varlist in two and run the same macro on each half in parallel to reduce the runtime (each run would still need event, since it is required to compute information value). My first thought was a shell script, but the number of variables is unknown, and that is where I'm stuck. Any help would be greatly appreciated. Thanks a lot.

Pᴇʜ
IndigoChild
  • So where is the problem with a shell script? Pass one run a parameter telling it to use the first half of the list and the other run a parameter for the second half. In SAS, split the list in half or divide it into odd and even positions. – Lee Aug 19 '19 at 06:39
  • Please provide a [minimal, verifiable and complete example](https://stackoverflow.com/help/minimal-reproducible-example) of the code you are attempting to parallelise. Your current example is too limited for anyone to provide much useful advice. – user667489 Aug 19 '19 at 07:58
  • To be clear it's more likely the source code the macro generates is taking a long time... Sort of like taking 4 hours to plan an 8-day trip and saying the planning is taking a very long time. – Richard Aug 19 '19 at 13:43
  • Given the code you've provided, we don't know how you're using that macro variable and can offer no methods on speeding things up. There may or may not be ways, you need to provide more details. – Reeza Aug 19 '19 at 15:08

1 Answer

Managing parallel execution inside SAS is rather inconvenient and involves MP CONNECT (part of SAS/CONNECT) or SAS Grid, using signon/rsubmit.

Parallel execution in the shell is much easier, for example:

echo "param1 param2 param3" | tr ' ' '\n' | xargs -I{} -P 2 ./run-sas.sh {}

-P 2 specifies the number of parallel processes. I covered passing parameters to a child SAS session in a recent answer.
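Since the number of variables is not known in advance, the list can be split by position rather than by count. The following is only a sketch: `split_varlist` is a hypothetical helper (not part of SAS or of the answer above) that deals the names round-robin into N chunks and prints one chunk per line, so each line can be handed to one parallel job. How `./run-sas.sh` forwards its argument to SAS is an assumption here; one option is the `-sysparm` command-line option, read inside SAS as `&sysparm`.

```shell
#!/usr/bin/env bash
# Hypothetical helper: split a list of variable names into N chunks.
# Usage: split_varlist N name1 name2 ...
# Names are dealt round-robin, so the total count need not be known up front.
split_varlist() {
  local n="$1"
  shift
  # One name per line into awk; awk appends each name to chunk (line number mod n).
  printf '%s\n' "$@" | awk -v n="$n" '
    { i = (NR - 1) % n
      chunk[i] = (chunk[i] == "" ? $0 : chunk[i] " " $0) }
    END { for (i = 0; i < n; i++) if (chunk[i] != "") print chunk[i] }'
}

# Example (commented out because run-sas.sh is site-specific):
# with -I{}, xargs treats each input line as a single argument,
# so a whole chunk reaches the script as "$1".
# split_varlist 2 $VARLIST | xargs -I{} -P 2 ./run-sas.sh {}
```

Each job would still pass the same event variable to the macro alongside its chunk, since the macro keeps event separately from varlist.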

Nickolay