
I have a negative log-likelihood function to minimize. I want to pass the array of observations to the objective function as an extra argument rather than hard-coding it inside the function, but strangely the optimizer explodes. I would like to understand why, and what needs to change to get a converging optimizer.

I set the observations as a parameter of the function this way (mn stands for scipy.optimize.minimize):

import numpy as np
from scipy.optimize import minimize as mn

def f(x, d):
    alfa = x[0]
    lambda_ = x[1]
    return -n * np.log(alfa) * lambda_ + alfa * sum(d)

n = 2000                          # number of observations
y = np.random.exponential(2, n)   # vector of observations

res = mn(f, x0=[2, 1/2], args=y)

and the results are:

    fun: nan
 hess_inv: array([[0.67448386, 0.61331579],
       [0.61331579, 0.55866767]])
      jac: array([nan, nan])
  message: 'Desired error not necessarily achieved due to precision loss.'
     nfev: 452
      nit: 2
     njev: 113
   status: 2
  success: False
        x: array([-2947.66055677, -2680.19131049])

whereas if I generate the observations inside the function instead of passing them as a parameter

def f(x):
    alfa = x[0]
    lambda_ = x[1]
    n = 2000
    y = np.random.exponential(2, n)
    return -n * np.log(alfa) * lambda_ + alfa * sum(y)

mn(f, x0=[2, 2])

I get quite good estimates:

    fun: 5072.745186459168
 hess_inv: array([[ 3.18053796e-16, -1.07489375e-15],
       [-1.07489371e-15,  3.63271745e-15]])
      jac: array([1.65160556e+10, 1.11412293e+10])
  message: 'Desired error not necessarily achieved due to precision loss.'
     nfev: 122
      nit: 3
     njev: 28
   status: 2
  success: False
        x: array([1.99998635, 1.99999107])

Even though the optimizer doesn't report it as a success.

Joachim
  • Please include all the imports in the code example. In particular, what is `mn`? – MB-F Mar 14 '19 at 09:46
  • Oh, and please note that the "quite good estimations" are basically just the starting guess values. The optimizer did not finish successfully (`success: False`). – MB-F Mar 14 '19 at 09:50
  • @kazemakase sorry I forgot to mention that mn stands for scipy.optimize.minimize – Joachim Mar 14 '19 at 11:04

1 Answer


Part 1

First, let me answer the explicit question:

I would like to understand why, and what needs to change to get a converging optimizer.

The function `- n * np.log(alfa) * lambda_ + alfa * sum(d)` cannot be minimized. For a given finite, positive lambda the function has a unique minimum in alfa, because the terms `-log(alfa)` and `+alfa` grow at different rates. Now look at lambda in isolation: the function is linear in lambda, so it has no finite minimum; for `alfa > 1` it keeps decreasing as lambda goes towards +inf.

The optimizer therefore tries to push lambda towards infinity, but increasing lambda also pushes the alfa-minimum (which sits at alfa = n * lambda_ / sum(d)) further out, and things diverge quickly...

At least in theory. In practice, sooner or later the optimizer tries a negative alfa, which makes np.log return NaN. For some reason this makes the optimizer think it is a good idea to decrease alfa even further.
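
A quick numerical check illustrates both effects (just a sketch of mine; `d` here is a freshly drawn sample standing in for your data):

import numpy as np

d = np.random.exponential(2, 2000)   # stand-in sample for the observations
n = len(d)

def cost(alfa, lambda_):
    return -n * np.log(alfa) * lambda_ + alfa * np.sum(d)

for lam in (1, 10, 100, 1000):
    print(lam, cost(2.0, lam))       # keeps decreasing as lambda grows

print(cost(-1.0, 1.0))               # nan: log of a negative number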

Two steps to achieve a converging optimizer:

  1. Use a cost function that can be minimized (e.g. keep lambda fixed, or put a penalty on large lambdas)
  2. Tell the optimizer that alfa must not be negative, with bounds=[(0, np.inf), (-np.inf, np.inf)] and a method that supports bounds, such as L-BFGS-B (see the sketch below)
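
Putting the two steps together, here is a minimal sketch (the fixed lambda value, the seed, and the lower bound of 1e-9 are arbitrary choices of mine, not something from the question):

import numpy as np
from scipy.optimize import minimize

np.random.seed(0)                     # arbitrary seed, only for reproducibility
n = 2000
y = np.random.exponential(2, n)       # observations, generated once

lambda_fixed = 1.0                    # step 1: hold lambda constant

def f(x, d):
    alfa = x[0]
    return -n * np.log(alfa) * lambda_fixed + alfa * np.sum(d)

# step 2: keep alfa strictly positive; L-BFGS-B supports bounds
res = minimize(f, x0=[2.0], args=(y,), method='L-BFGS-B',
               bounds=[(1e-9, None)])
print(res.x)                          # converges to roughly n / sum(y)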

Part 2

What's happening in the "not as a parameter" scenario is an entirely different story. With y = np.random.exponential(2, n) inside the function, the cost function generates new random values each time it is called. This makes the cost function stochastic, but the optimizer expects a deterministic function.

Each time the optimizer calls the function it gets a partially random result. This confuses the optimizer, and after three iterations (nit: 3) it gives up. In three steps the estimate has barely moved away from the initial values, which is why the result only looks good.
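
You can see the randomness directly by evaluating the cost at the same point several times (a small sketch of the second version from the question):

import numpy as np

def f(x):
    alfa = x[0]
    lambda_ = x[1]
    n = 2000
    y = np.random.exponential(2, n)   # a new random sample on every call
    return -n * np.log(alfa) * lambda_ + alfa * np.sum(y)

# same input, three different outputs: finite-difference gradients become noise
print(f([2, 2]), f([2, 2]), f([2, 2]))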

MB-F