
I am working on a macro for regressions using the following code:

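%macro Regression;
    /* &Var2 = space-separated dependent variables, &var = candidate regressors (both defined elsewhere) */
    %let index = 1;
    %do %until (%scan(&Var2, &index, %str( )) = );
        %let Ind = %scan(&Var2, &index, %str( ));

        /* capture the stepwise selection summary for this dependent variable */
        ods output SelectionSummary = SelectionSummary;
        proc reg data = Regression2 plots = none;
            model &Ind = &var / selection = stepwise maxstep = 1;
            /* note: R= writes residuals (named RSQUARE here), not the R-square statistic */
            output out = summary R = RSQUARE;
        run;
        quit;

        /* the first pass creates final; later passes append to it */
        %if &index = 1 %then %do;
            data final;
                set selectionsummary;
            run;
        %end;
        %else %do;
            data final;
                set final selectionsummary;
            run;
        %end;

        %let index = %eval(&index + 1);
    %end;
%mend Regression;

%Regression;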

This code works and gives me a table showing, for each dependent variable, the independent variable that explains the most variation in it.

I'm looking for a way to run this so that the regression gives me the three best independent variables for each dependent variable, with each one judged as if it were the first variable entered. For example:

models chosen:

GDP = Human Capital
GDP = Working Capital
GDP = Growth

DependentVar Ind1          Ind2            Ind3    Rsq1 Rsq2 Rsq3
GDP          human capital working capital growth  0.76 0.75 0.69

or

DependentVar Independent1    Rsq
GDP          human capital   0.76
GDP          working capital 0.75
GDP          growth          0.69

EDIT:

It would be an absolute bonus if there were a way to set stepwise maxstep = 3 and get the best three combinations of independent variables for each dependent variable, with the condition that the first independent variable in each combination is unique.

TIA.

Reeza
78282219
  • How are you defining the 'three best independent variables'? – Reeza Apr 11 '18 at 18:50
  • I guess it depends on how SAS defines them. In a stepwise regression the variable is chosen based on which explains the most variance, i.e. has the best R^2 – 78282219 Apr 12 '18 at 05:54
  • You're using the default settings? Did you check what that actually is? It's not R^2, it's the F statistic. The R-squared is a different option. I would strongly advise you to define your methodology explicitly. – Reeza Apr 12 '18 at 16:21
  • http://documentation.sas.com/?docsetId=statug&docsetTarget=statug_reg_details08.htm&docsetVersion=14.3&locale=en#statug_reg003824 – Reeza Apr 12 '18 at 16:21
  • I was using the stepwise regression, then picking out the R^2 from the summary statistics and outputting it into a different table – 78282219 Apr 16 '18 at 15:14
  • Thank you, I'll read this now – 78282219 Apr 16 '18 at 15:15

1 Answer


Try the STOP=3 option on your MODEL statement. It fits the best model with up to three variables. It does not work with the stepwise option, but it does work with R-squared-based selection such as MAXR:

model &Ind = &var / selection = maxR stop=3;

If you only want to consider three-variable models, include start=3 as well.

model &Ind = &var / selection = maxR stop=3 start=3;
Reeza
  • Quick question: how would I put the R^2 into a table for comparison? The output is different in this case, and my usual code shown above doesn't work – 78282219 Apr 16 '18 at 16:52
  • Finding the relevant ODS table and capturing it is likely the easiest solution. The OUTPUT statement should still generate the R-squared values. – Reeza Apr 16 '18 at 17:48
  • Unfortunately for MaxR the R-squared is only shown as a line before any summary tables, the output table labels are: Diagnostic Panel ANOVA – 78282219 Apr 17 '18 at 05:45
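
For the table sketched in the question (the three best single regressors per dependent variable, with their R-squared values), one possible route is SELECTION=RSQUARE rather than MAXR, since BEST=3 with START=1 and STOP=1 requests the three one-variable models with the highest R-square. The following is a rough, untested sketch under several assumptions: the ODS table for this selection method is taken to be SubsetSelSummary (running `ods trace on` before the procedure should confirm the name in your SAS version), and the macro name Top3 and the data sets best3 and final3 are made up for illustration. Regression2, &Var2 and &var are the names used in the question.

```sas
/* Sketch only: Top3, best3 and final3 are hypothetical names.       */
/* SubsetSelSummary is assumed to be the ODS table produced by        */
/* SELECTION=RSQUARE; verify it with ODS TRACE ON.                    */
%macro Top3;
    %local index Ind;

    /* start from an empty result so repeated runs do not accumulate */
    proc datasets library = work nolist nowarn;
        delete final3;
    quit;

    %let index = 1;
    %let Ind = %scan(&Var2, &index, %str( ));
    %do %while (%length(&Ind) > 0);

        /* three best one-regressor models, ranked by R-square */
        ods output SubsetSelSummary = best3;
        proc reg data = Regression2 plots = none;
            model &Ind = &var / selection = rsquare start = 1 stop = 1 best = 3;
        run;
        quit;

        /* tag each row with the dependent variable before stacking */
        data best3;
            length DependentVar $32;
            set best3;
            DependentVar = "&Ind";
        run;

        proc append base = final3 data = best3 force;
        run;

        %let index = %eval(&index + 1);
        %let Ind = %scan(&Var2, &index, %str( ));
    %end;
%mend Top3;

%Top3;
```

If this works as assumed, final3 holds up to three rows per dependent variable, matching the second (long) layout in the question. The column names in the captured table can be checked with PROC CONTENTS. Raising STOP= to 3 would also report the best two- and three-variable subsets, though it would not by itself enforce a unique first variable.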