Finding the Hessian matrix of this function

Question

Hi I have the following function:

sum from 1 to 5000 -log(1−(xi)^2) -log(1-(a_i)^t*x), where a_i is a random vector and we are trying to minimize this function's value via Netwon's method.

I need a way to calculate the Hessian matrix with respect to (x1, x2, x3, ...). I tried auto-gradient but it took too much time. Here is my current time.

from autograd import elementwise_grad as egrad
from autograd import jacobian
import autograd.numpy as np

x=np.zeros(5000);
a = np.random.rand(5000,5000)
def f (x):
  sum = 0;
  for i in range(5000):
      sum += -np.log(1 - x[i]*x[i]) - np.log(1-np.dot(x,a[i]))
  
  return sum;

df = egrad(f)
d2f = jacobian(egrad(df)); 
print(d2f(x));

I have tried looking into sympy but I am confused on how to proceed.

iacob · Answer 1 · 2021-03-31T09:59:06.247

1

PyTorch has a GPU optimised hessian operation:

import torch

torch.autograd.functional.hessian(func, inputs)

edited Mar 31 '21 at 09:59

answered Mar 31 '21 at 09:30

iacob

20,084
6
92
119

Michael · Answer 2 · 2021-03-31T09:34:09.407

You can use the regular NumPy vectorization array operations which will speed up significantly the execution of the program:

from autograd import elementwise_grad as egrad
from autograd import jacobian
import autograd.numpy as np
from time import time
import datetime


n = 5000
x = np.zeros(n)
a = np.random.rand(n, n)

f = lambda x: -1 * np.sum(np.log(1-x**2) + np.log(1-np.dot(a, x)))

t_start = time()
df = egrad(f)
d2f = jacobian(egrad(df))
t_end = time() - t_start
print('Execution time: ', datetime.datetime.fromtimestamp(t_end).strftime('%H:%M:%S'))

Output

Execution time: 02:02:27

In general, every time you deal with numeric data, you should avoid by all means using loops for calculations, as they usually become the bottleneck of the program due to the their header and the maintenance of the counter variable.

NumPy on the other hand, uses a very short header for each array, and is highly optimized, as you'd expect, for numeric calculations.

Note the x**2 which squares every item of x instead of x[i]*x[i], and the np.dot(a, x) which performs the np.dot(x, a[i]) in just one command (where x and a switch places to fit the required dimensions).

You can refer to this great e-book which will explain this point in a greater detail.

Cheers

Ok it seems I made a mistake in the post the problem isn't computing the Hessian function, but evaluating it for a specify x. Like when I try to print(df(x)) it takes took long — DockingBlade, Mar 31 '21 at 09:58

Finding the Hessian matrix of this function

2 Answers2