I want to find the best model specification for a logit regression whose dependent variable is multinomially distributed. Y has three outcomes, and I want to build a forecasting model from two variables: a lagged and differenced spot-rate time series and a time series of estimated realized volatility.
My initial thought was to write a loop that goes through each specification and outputs its AIC value, so that I can then go back and pick the model with the lowest AIC.
This works, but there's a hitch. I want to look at the spot rate as a difference, for example Spot_t - Spot_t-n (n could be 21). This opens up a whole lot of specifications. In my trial regression I included 12 variables of each kind, with the k-th variable lagged by 21 * k days. This gave a good model, but I think I need a better iterative process.
If I limit my model to 12 variables/lags of each variable, we are talking about 24 nested loops. Many of the iterations within these loops repeat the same specification, which is time-consuming and silly in my opinion. Maybe there is a way around this.
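To put rough numbers on that redundancy, here is a minimal sketch in Python (not SAS; it only does the counting). It assumes that the order of regressors does not matter and that listing the same lag twice adds nothing to the model, so a specification is identified by the set of spot lags and the set of volatility lags it contains. The function names are made up for this illustration.

```python
from itertools import product
from math import comb

SLOTS = 12  # slots per variable family (12 spot lags + 12 volatility lags)


def loop_iterations(z, slots=SLOTS):
    """Number of model fits performed by 2 * slots nested loops of length z."""
    return z ** (2 * slots)


def distinct_models(z, slots=SLOTS):
    """Distinct specifications, ASSUMING order and repeated lags are irrelevant.

    Each family contributes one non-empty subset of the z candidate lags,
    with at most `slots` members.
    """
    per_family = sum(comb(z, k) for k in range(1, min(z, slots) + 1))
    return per_family ** 2


def distinct_by_enumeration(z, slots):
    """Brute-force check for ONE family: collapse each loop tuple to a set."""
    combos = product(range(1, z + 1), repeat=slots)
    return len({frozenset(c) for c in combos})


if __name__ == "__main__":
    for z in (2, 3, 4):
        print(z, loop_iterations(z), distinct_models(z))
```

Under those assumptions, z = 2 means 16,777,216 loop iterations for only 9 distinct models, so almost all of the fits are repeats. Looping over subsets of lags instead of over slots would avoid them.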
I am not used to coding in SAS. I have decent experience in VBA.
My code is pasted below, and if you have any idea how to do this differently I would really appreciate it! Maybe it can be done with arrays or something like that, but since I am new to SAS programming, maybe you could shed some light on how to do all this :)
%macro Selectvariables;
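%let y = 0; /* running model counter, used to name each model and its AIC data set */
%let z = 2; /* number of candidate lags tried in each of the 24 slots */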
%do a = 1 %to &z;
%do b = 1 %to &z;
%do c = 1 %to &z;
%do d = 1 %to &z;
%do e = 1 %to &z;
%do f = 1 %to &z;
%do g = 1 %to &z;
%do h = 1 %to &z;
%do i = 1 %to &z;
%do j = 1 %to &z;
%do k = 1 %to &z;
%do l = 1 %to &z;
%do m = 1 %to &z;
%do n = 1 %to &z;
%do o = 1 %to &z;
%do p = 1 %to &z;
%do q = 1 %to &z;
%do r = 1 %to &z;
%do s = 1 %to &z;
%do t = 1 %to &z;
%do u = 1 %to &z;
%do v = 1 %to &z;
%do w = 1 %to &z;
%do x = 1 %to &z;
%let First_Spot_var = Spotlag_&a;
%let Second_Spot_var = Spotlag_&b;
%let Third_Spot_var = Spotlag_&c;
%let Fourth_Spot_var = Spotlag_&d;
%let Fifth_Spot_var = Spotlag_&e;
%let Sixth_Spot_var = Spotlag_&f;
%let Seventh_Spot_var = Spotlag_&g;
%let Eighth_Spot_var = Spotlag_&h;
%let Nine_Spot_var = Spotlag_&i;
%let Tenth_Spot_var = Spotlag_&j;
%let Eleventh_Spot_var = Spotlag_&k;
%let Twelveth_Spot_var = Spotlag_&l;
%let First_vol_var = vollag_&m;
%let Second_vol_var = vollag_&n;
%let Third_vol_var = vollag_&o;
%let Fourth_vol_var = vollag_&p;
%let Fifth_vol_var = vollag_&q;
%let Sixth_vol_var = vollag_&r;
%let Seventh_vol_var = vollag_&s;
%let Eighth_vol_var = vollag_&t;
%let Nine_vol_var = vollag_&u;
%let Tenth_vol_var = vollag_&v;
%let Eleventh_vol_var = vollag_&w;
%let Twelveth_vol_var = vollag_&x;
%let Name = Model_&y;
proc Logistic data=CurrencyData;
&Name.: model Y1_Optimal_Strategy_3M = &First_Spot_var &Second_Spot_var &Third_Spot_var &Fourth_Spot_var &Fifth_Spot_var &Sixth_Spot_var &Seventh_Spot_var &Eighth_Spot_var &Nine_Spot_var &Tenth_Spot_var &Eleventh_Spot_var &Twelveth_Spot_var &First_vol_var &Second_vol_var &Third_vol_var &Fourth_vol_var &Fifth_vol_var &Sixth_vol_var &Seventh_vol_var &Eighth_vol_var &Nine_vol_var &Tenth_vol_var &Eleventh_vol_var &Twelveth_vol_var;
ods output FitStatistics=AIC_&Name(where=(criterion="AIC"));
run;
%let y = %Eval(&y+1);
%end;
%end;
%end;
%end;
%end;
%end;
%end;
%end;
%end;
%end;
%end;
%end;
%end;
%end;
%end;
%end;
%end;
%end;
%end;
%end;
%end;
%end;
%end;
%end;
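/* Stack every AIC_Model_<n> data set; INDSNAME records which data set each row came from */
data AllAIC;
set AIC_: INDSNAME=modelVars;
dsname = scan(modelVars, 2); /* drop the libref, e.g. WORK.AIC_MODEL_0 -> AIC_MODEL_0 */
run;
/* Sort ascending by the AIC of the full model (intercept and covariates), lowest AIC first */
proc sort data=AllAIC out=allAIC_Sorted;
by InterceptAndCovariates;
run;
/* Print the most recently created data set (allAIC_Sorted) */
proc Print; run;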
%mend;
Sorry for the crazy wide code. I hope you can help me. Maybe I am overcomplicating the issue. :)
Thanks a lot. Best regards, Christian
EDIT: I have set z = 2 just for testing purposes. Ideally this would be considerably higher.