-1
function A=obj(b,Y, YL, tempcyc, cyc, Yzero, age, agesq, educ, ageav, agesqav, educav, qv, year1, year2, year3, year4, year5, year6, year7, workregion1, workregion2, workregion3, workregion4, workregion5, workregion6, workregion7, workregion8, workregion9, workregion10, workregion11, workregion12, workregion13, workregion14, workregion15, workregion16, qw)
 A=0;
for i=1:715
S=0;
for m=1:12
P=1;
for t=1:8
P=P*(normcdf((2*Y(i,t)-1)*(b(1)*YL(i,t)+b(2)*tempcyc(i,t)+b(3)*cyc(i,t)+b(4)*Yzero(i,t)+b(5)*age(i,t)+b(6)*agesq(i,t)+b(7)*educ(i,t)+b(8)*ageav(i,t)+b(9)*agesqav(i,t)+b(10)*educav(i,t)+b(11)*1+b(12)*sqrt(2)*qv(m,1)+b(13)*year1(i,t)+b(14)*year2(i,t)+b(15)*year3(i,t)+b(16)*year4(i,t)+b(17)*year5(i,t)+b(18)*year6(i,t)+b(19)*year7(i,t)+b(20)*workregion1(i,t)+b(21)*workregion2(i,t)+b(22)*workregion3(i,t)+b(23)*workregion4(i,t)+b(24)*workregion5(i,t)+b(25)*workregion6(i,t)+b(26)*workregion7(i,t)+b(27)*workregion8(i,t)+b(28)*workregion9(i,t)+b(29)*workregion10(i,t)+b(30)*workregion11(i,t)+b(31)*workregion12(i,t)+b(32)*workregion13(i,t)+b(33)*workregion14(i,t)+b(34)*workregion15(i,t)+b(35)*workregion16(i,t))));
end

S=S+qw(m,1)*P;
end

A=A+log(S/sqrt(pi));
A=-A;
end

This is my objective function I am minimizing (actually maximizing; I am minimizing -A) and the parameters I am estimating is b=[b(1)............b(35)]. Y, YL, tempcyc,.........., qw are matrices that I am imported as data. The objective function consists of a product(across t=1:8) nested within a sum (across m=1:12) that is in turn nested within a sum (across n=1:715). And below is my code using fminsearch for my unconstrained minimization.

% %% Minimization
start=[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0];
iter=50000;
options=optimset('Display','iter','MaxIter',iter,'MaxFunEvals',100000); 
[b, fval, exitflag, output]=fminsearch(@(b)obj(b,Y, YL, tempcyc, cyc, Yzero, age, agesq, educ, ageav, agesqav, educav, qv, year1, year2, year3, year4, year5, year6, year7, workregion1, workregion2, workregion3, workregion4, workregion5, workregion6, workregion7, workregion8, workregion9, workregion10, workregion11, workregion12, workregion13, workregion14, workregion15, workregion16, qw), start, options);%
%% Results
fprintf('[b] : % 1.4e % 1.4e %1.4e % 1.4e % 1.4e % 1.4e \n',...b(1),b(2),b(3),b(4), b(5),b(6),b(7),b(8), b(9),b(10),b(11),b(12), b(13),b(14),b(15),b(16), b(17),b(18),b(19),b(20), b(21),b(22),b(23),b(24), b(25),b(26),b(27),b(28), b(29),b(30),b(31),b(32), b(33),b(34), b(35));
 end

The problem is that the optmization results do not show up even after 12+ hours (It is still reflecting, expanding, contracting inside, etc.) Could someone give me an idea on how to make the process faster?

Thank you.

Rody Oldenhuis
  • 37,726
  • 7
  • 50
  • 96
  • Did you read http://www.mathworks.com/help/optim/ug/choosing-a-solver.html – dranxo Jun 20 '14 at 08:58
  • Yes I did; I believe my only options are fminunc and fminsearch. At first I used fminunc, but the results just stopped at the first iteration and there was no further progress..so I switched to fminsearch. fminsearch is a bit faster; it makes about 10 iterations per minute, but still the final results do not show up. – user3718095 Jun 20 '14 at 09:29
  • I have the impression your code will not run out of the box because at least 20 parameters are not defined. Without running code it's difficult to judge behavior. `fminsearch` is a well tested and not to inefficient function. – NoDataDumpNoContribution Jun 20 '14 at 09:36
  • @Trilarion What do you mean by 'at least 20 parameters are not defined'? Do you mean that there are too many parameters to estimate? b(1), b(2),...., b(35)? Or is there something wrong with my code? The rest of the parameters Y, YL, tempcyc,.........., qw are parameters that I "passed on" and defined by my imported data. And the program does run, no errors, but the process is really slow.. – user3718095 Jun 20 '14 at 10:21
  • So what are the values of `Y`, `YL`, `tempcyc`, `cyc`, `Yzero`, `age`, `agesq`, `educ`, `ageav`, `agesqav`, `educav`, `qv`, `year1`, `year2`, `year3`, `year4`, `year5`, `year6`, `year7`, `workregion1`, `workregion2`, `workregion3`, `workregion4`, `workregion5`, `workregion6`, `workregion7`, `workregion8`, `workregion9`, `workregion10`, `workregion11`, `workregion12`, `workregion13`, `workregion14`, `workregion15`, `workregion16`, and `qw` you are using? – Rody Oldenhuis Jun 20 '14 at 11:08
  • And how on Earth do you check for bugs on a line of code 713 characters long!? Please tell me you generated that somehow... – Rody Oldenhuis Jun 20 '14 at 11:10
  • Are you aware of the fact that the statement `A = -A` is inside the outer loop? That means `A` will switch sign 715 times; I suspect you'd want to move that out of the loop. – Rody Oldenhuis Jun 20 '14 at 11:13
  • It sure sounds to me like you have a nasty bug in your objective function, because `fminsearch` uses the Nelder/Mead simplex method which scales very poorly with increasing dimensions. `fminunc` ((quasi) Newton) is the function of choice for problems of larger dimensionality (and yes, 35 can safely be called "large"). The fact that `fminunc` terminates is a sign there is something wrong with your code, ***not*** that the algorithm sucks (one of the more important lessons you should learn as a developer is humility; it's usually *your code* that's wrong, not the library's). – Rody Oldenhuis Jun 20 '14 at 11:19
  • Anyway, as it stands, your code is so poorly written that I'm not going to continue to help you fix it. Your example does not work (missing parameter values), your lines are uncommented, unintelligible spaghetti, you have obvious design flaws (like that sign flip on every iteration) that should have popped up immediately in basic debugging, you run on the full-scale problem rather than a tiny toy version to verify your objective function, and so on. Please fix these things first, and should you still have problems, feel free to ask. – Rody Oldenhuis Jun 20 '14 at 11:23
  • @ Rody Oldenhuis Y, YL, tempcyc, cyc, Yzero, age, agesq, educ, ageav, agesqav, educav, qv, year1, year2, year3, year4, year5, year6, year7, workregion1, workregion2, workregion3, workregion4, workregion5, workregion6, workregion7, workregion8, workregion9, workregion10, workregion11, workregion12, workregion13, workregion14, workregion15, workregion16, and qw -> each of these are txt files that I imported into Matlab; they are matrices containing actual data. So my objective function contains elements of this matrices, for example (i,t) element of the matrix in cyc.txt – user3718095 Jun 20 '14 at 11:26
  • @ Rody Oldenhuis So when I ran this program I did not bother to "define" Y, YL,.........., workregion16, and qw like Y=Y, Y=YL, etc. because it's already there in the workspace after I imported all the txt files – user3718095 Jun 20 '14 at 11:32
  • 1
    Please understand that we cannot reliably reproduce your problems when we don't have the same data. You'll get better answers much quicker if you write a minimal working example; please read for example, [this](http://meta.tex.stackexchange.com/questions/228/ive-just-been-asked-to-write-a-minimal-example-what-is-that) on why and how to write one. – Rody Oldenhuis Jun 20 '14 at 11:38

1 Answers1

2

This is not a complete answer to your question; this is a suggestion on how to better write your code, which is probably half the answer.

I would write your calling script something like this:

%% Minimization
b0   = zeros(1,35);
iter = 5e4;
options=optimset(...
    'Display'    , 'iter',...
    'MaxIter'    , iter,...
    'MaxFunEvals', 1e5);

data = {...
    YL, tempcyc, cyc, Yzero, age, agesq, educ, ageav, agesqav, educav, qv, ...
    year1, year2, year3, year4, year5, year6, year7, ...
    workregion1, workregion2, workregion3, workregion4, workregion5, ...
    workregion6, workregion7, workregion8, workregion9, workregion10, ...
    workregion11, workregion12, workregion13, workregion14, workregion15, ...
    workregion16
};

[b, fval, exitflag, output] = fminsearch(@(b)obj(b, Y,data,qw), b0, options);

%% Print results
fprintf('[b]:');
fprintf('%1.4e', b);
fprintf('\n');

(actually, I would also wrap the data in cells instead of having 16 workregions and 7 years etc. But you may not be the creator of the data, so this may not always be possible)

You could re-write your objective function something like this:

function A = obj(b, Y, others, qw)

    A   = 0;    
    sPi = -0.5*log(pi);
    bB  = num2cell(b);    
    for ii = 1:715                
        P = prod( normcdf((2*Y(ii,1:8)-1) .* cellfun(@(x,y) x(ii,1:8).*y, others, bB)) );
        S = sum( P*qw(1:12,1) );
        A = A + log(S) + sPi;
    end

    A = -A;

end

Probably also the outer loop can be simplified into vectorized form. But I cannot test the correctness of all this (or whether it runs at all), because I don't have access to your data. But I trust you get the idea -- you can now at least read your code, and probably spot the error yourself.

Rody Oldenhuis
  • 37,726
  • 7
  • 50
  • 96