2

I am looking to see how a function that computes the Hessian of a log-likelihood can be compiled, so that it can be efficiently used with different sets of parameters.

Here is an example.

Suppose we have a function that computes the log-likelihood of a logit model, where y is vector and x is a matrix. beta is a vector of parameters.

 pLike[y_, x_, beta_] :=
 Module[
  {xbeta, logDen},
  xbeta = x.beta;
  logDen = Log[1.0 + Exp[xbeta]];
  Total[y*xbeta - logDen]
  ]

Given the following data, we can use it as follows

In[1]:= beta = {0.5, -1.0, 1.0};

In[2]:= xmat = 
  Table[Flatten[{1, 
     RandomVariate[NormalDistribution[0.0, 1.0], {2}]}], {500}];

In[3]:= xbeta = xmat.beta;

In[4]:= prob = Exp[xbeta]/(1.0 + Exp[xbeta]);

In[5]:= y = Map[RandomVariate[BernoulliDistribution[#]] &, prob] ;

In[6]:= Tally[y]

Out[6]= {{1, 313}, {0, 187}}

In[9]:= pLike[y, xmat, beta]

Out[9]= -272.721

We can write its hessian as follows

 hessian[y_, x_, z_] :=
  Module[{},
   D[pLike[y, x, z], {z, 2}]
  ]


In[10]:= z = {z1, z2, z3}

Out[10]= {z1, z2, z3}

In[11]:= AbsoluteTiming[hess = hessian[y, xmat, z];]

Out[11]= {0.1248040, Null}

In[12]:= AbsoluteTiming[
 Table[hess /. {z1 -> 0.0, z2 -> -0.5, z3 -> 0.8}, {100}];]

Out[12]= {14.3524600, Null}

For efficiency reasons, I can compile the original likelihood function as follows

pLikeC = Compile[{{y, _Real, 1}, {x, _Real, 2}, {beta, _Real, 1}},
   Module[
    {xbeta, logDen},
    xbeta = x.beta;
    logDen = Log[1.0 + Exp[xbeta]];
    Total[y*xbeta - logDen]
    ],
   CompilationTarget -> "C", Parallelization -> True,  
   RuntimeAttributes -> {Listable}
   ];

which yields the same answer as pLike

In[10]:= pLikeC[y, xmat, beta]

Out[10]= -272.721

I am looking for an easy way to obtain similarly, a compiled version of the hessian function, given my interest in evaluating it many times.

Amir
  • 10,600
  • 9
  • 48
  • 75
asim
  • 389
  • 2
  • 10

2 Answers2

7

Leonid already beat me, but I'll post my line of thought anyway just for laughs.

The main problem here is that compilation works for numerical functions whereas D needs symbolics. So the trick would be to first define the pLike function with the same amount of variables as needed for the particular size of matrices you intend to use, e.g,

pLike[{y1, y2}, {{x1, x2, x3}, {x12, x22, x32}}, {z1, z2, z3}]

enter image description here

The Hessian:

D[pLike[{y1, y2}, {{x1, x2, x3}, {x12, x22, x32}}, {z1, z2, z3}], {{z1, z2, z3}, 2}]

enter image description here

This function should be compilable as it depends on numerical quantities only.

To generalize for various vectors one could build something like this:

Block[{ny = 2, nx = 3, z1, z2, z3},
   hessian[
      Table[ToExpression["y" <> ToString[i] <> "_"], {i, ny}], 
      Table[ToExpression["xr" <> ToString[i] <> "c" <> ToString[j] <> "_"], 
           {i, ny}, {j, nx}], {z1_, z2_, z3_}
   ] =
   D[
     pLike[
        Table[ToExpression["y" <> ToString[i]], {i, ny}], 
        Table[ToExpression["xr" <> ToString[i] <> "c" <> ToString[j]], 
             {i, ny}, {j, nx}], {z1, z2, z3}
        ], 
     {{z1, z2, z3}, 2}
   ]
 ]

enter image description here

And this can of course be easily generalized for variable nx and ny.


And now for the Compile part. It's an ugly piece of code, consisting of the above and a compile and made suitable for variable y size. I like ruebenko's code more than mine.

ClearAll[hessianCompiled];
Block[{z1, z2, z3},
 hessianCompiled[yd_] :=
  (hessian[
     Table[ToExpression["y" <> ToString[i] <> "_"], {i, yd}], 
     Table[ToExpression["xr" <> ToString[i]<>"c"<>ToString[j] <>"_"],{i,yd},{j,3}],
     {z1_, z2_, z3_}
     ] =
    D[
     pLike[
      Table[ToExpression["y" <> ToString[i]], {i, yd}],
      Table[ToExpression["xr" <> ToString[i] <> "c" <> ToString[j]], {i,yd},{j,3}],
      {z1, z2, z3}
     ], {{z1, z2, z3}, 2}
    ];
   Compile[{{y, _Real, 1}, {x, _Real, 2}, {z, _Real, 1}}, 
    hessian[Table[y[[i]], {i, yd}], Table[x[[i, j]], {i, yd}, {j, 3}],
      Table[z[[i]], {i, 3}]]]// Evaluate] // Quiet
   )
 ]

hessianCompiled[500][y, xmat, beta] // Timing 

{1.497, {{-90.19295669, -15.80180276, 6.448357845}, 
        {-15.80180276, -80.41058154, -26.33982586},
        {6.448357845, -26.33982586, -72.92978931}}}

ruebenko's version (including my edits):

(cf = mkCHessian[500, 3]; cf[y, xmat, beta]) // Timing

{1.029, {{-90.19295669, -15.80180276, 6.448357845}, 
         {-15.80180276, -80.41058154, -26.33982586}, 
         {6.448357845, -26.33982586, -72.92978931}}}

Note that both tests include compilation time. Running the calculation on its own:

h = hessianCompiled[500];
Do[h[y, xmat, beta], {100}]; // Timing
Do[cf[y, xmat, beta], {100}]; // Timing

(* timing for 100 hessians: 

   ==> {0.063, Null}

   ==> {0.062, Null}
*)
Sjoerd C. de Vries
  • 16,122
  • 3
  • 42
  • 94
  • 1
    It is not clear to me, where the hessian is being compiled here. – asim Nov 21 '11 at 01:03
  • Well, I thought to be using something similar to what Leonid did, but since that answer has gone for now I have to provide something myself. Unfurtunately, that has to wait until tonight. – Sjoerd C. de Vries Nov 21 '11 at 06:24
3

Here is an idea based on the previous post(s): We construct the input to Compile symbolically.

mkCHessian[{y_, ys_Integer}, {x_, xs_Integer}, {beta_, bs_Integer}] :=
  With[{
   args = MapThread[{#1, _Real, #2} &, {{y, x, beta}, {1, 2, 1}}],
   yi = Quiet[Part[y, #] & /@ Range[ys]],
   xi = Quiet[Table[Part[x, i, j], {i, xs}, {j, xs}]],
   betai = Quiet[Part[beta, #] & /@ Range[bs]]
   },
  Print[args];
  Print[yi];
  Print[xi];
  Print[betai];
  Compile[Evaluate[args], 
   Evaluate[D[pLike[yi, xi, betai], {betai, 2}]]]
  ]

And then generate the compiled function.

cf = mkCHessian[{y, 3}, {x, 3}, {beta, 3}];

You then call that compiled function

cf[y, xmat, beta]

Please verify that I did not make a mistake; in de Vries's post y is of length 2. Mine is length 3. I am sure what is correct. And of course, the Print are for illustration...


Update
A version with slightly improved dimension handling and with variables localized:

ClearAll[mkCHessian];
mkCHessian[ys_Integer, bs_Integer] :=
 Module[
   {beta, x, y, args, xi, yi, betai},
   args = MapThread[{#1, _Real, #2} &, {{y, x, beta}, {1, 2, 1}}];
   yi = Quiet[Part[y, #] & /@ Range[ys]];
   xi = Quiet[Table[Part[x, i, j], {i, ys}, {j, bs}]];
   betai = Quiet[Part[beta, #] & /@ Range[bs]];
   Compile[Evaluate[args], Evaluate[D[pLike[yi, xi, betai], {betai, 2}]]]
 ]

Now, with asim's definitions in In[1] to In[5]:

cf = mkCHessian[500, 3];
cf[y, xmat, beta]

(* ==> {{-8.852446923, -1.003365612, 1.66653381}, 
       {-1.003365612, -5.799363241, -1.277665283},
       {1.66653381, -1.277665283, -7.676551252}}  *)

Since y is a random vector results will vary.

Sjoerd C. de Vries
  • 16,122
  • 3
  • 42
  • 94
  • I used 3 as just an example. I believe y in asim's case has length 500. – Sjoerd C. de Vries Nov 21 '11 at 15:04
  • Very elegant, by the way. I'll play with it a little tonight, but it looks like it should do its job well. – Sjoerd C. de Vries Nov 21 '11 at 15:25
  • @Sjoerd, thanks. If you have a play with it, have look for the following: If we want to compile 500 variables, this might not be the best way to go about, especially if compilation to "C" is required. Another thing to have a look at is the expression optimizer: h = D[{Cos[x y], Sin[x y]}, {{x, y}, 2}]; Compile[{x, y}, Evaluate[h]] compare that to Experimental`OptimizeExpression[h] so you can use the expression optimizer without the compiler. I never needed this, but it might come handy. –  Nov 21 '11 at 15:51
  • Oliver, I've been looking at your code and I noticed two problems: first you let the matrix xi be square, whereas its dimensions are determined by the dimensions of y and z (so no necessity to pass the x dimensions). There's also a problem with global variables. If y,x and beta are already defined (for instance, by having run asim's In[1] to In[5]) then your code doesn't work correctly anymore. I've localized the variables and made the code suitable for any dimension of y and z and placed it below your code. Could you have a look at it and finalize it as you like it? – Sjoerd C. de Vries Nov 21 '11 at 21:50
  • This looks good. Do I need to finalize this, or can you do it? Please go ahead if you want to. –  Nov 21 '11 at 22:32
  • Done. See also the update to my version: results and timings are the same. I assume both methods stand a good chance of being correct ;-) – Sjoerd C. de Vries Nov 21 '11 at 23:11
  • What is being done with the constructs like `Quiet[Part[y, #] & /@ Range[ys]]`? (Purely from a coding perspective, and not a mathematical one.) – Mr.Wizard Nov 22 '11 at 03:48
  • @mr.wizard they act as lists and matrices of symbolic variables, to be replaced with numbers after `D` and `Compile`. `Quit` kills the messages generated when indexing an unassigned variable. It works OK nevertheless. Interestingly, though pLike depends on y, the hessian doesn't. I commented on that at leonid's answer (now gone). – Sjoerd C. de Vries Nov 22 '11 at 06:34