I have a log-likelihood function that is the sum, over a very long list of customers, of individual log-likelihood functions, and I want to optimize it using scipy.optimize.minimize().

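import numpy as np
from numpy import exp, log
from scipy.optimize import minimize
from scipy.special import gammaln


def log_likelihood_individual(r, alpha, a, b, x, tx, t):
    # Log-likelihood contribution of one customer's observation (x, tx, t).
    ln_a1 = gammaln(r + x) - gammaln(r) + r * log(alpha)
    ln_a2 = gammaln(a + b) + gammaln(b + x) - gammaln(b) - gammaln(a + b + x)
    ln_a3 = -(r + x) * log(alpha + t)
    # The a4 term only contributes when x > 0.
    a4 = 0
    if x > 0:
        a4 = exp(log(a) - log(b + x - 1) - (r + x) * log(alpha + tx))
    return ln_a1 + ln_a2 + log(exp(ln_a3) + a4)


def log_likelihood(r, alpha, a, b, customers):
    # All four parameters must be strictly positive.
    if r <= 0 or alpha <= 0 or a <= 0 or b <= 0:
        return -np.inf
    # Sum the individual contributions over all customers.
    return sum(log_likelihood_individual(r, alpha, a, b, x, tx, t) for x, tx, t in customers)


def maximize(customers):
    # Maximizing the log-likelihood is the same as minimizing its negative.
    negative_ll = lambda params: -log_likelihood(*params, customers=customers)
    params0 = np.array([1., 1., 1., 1.])
    res = minimize(negative_ll, params0, method='CG')
    return res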

I have tried various algorithms from the SciPy list, but each time the algorithm gets lost. Can anyone give me general advice on how to tackle this kind of problem, i.e., minimizing a function I can't really understand?

sweeeeeet

1 Answer

A general question provokes a general answer ;)

Most of my fit attempts that fail (i.e. don't converge) do so because of poorly chosen initial values. Ask yourself:

  • Is params0 = np.array([1., 1., 1., 1.]) really a good initial guess?
  • Did you also try params0 = np.array([0., 0., 0., 0.]) or any other combination (brute force)?
  • Can you create an example set where you know the ideal values for the parameters? Did you try to fit it?
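
The brute-force idea from the list above could be sketched roughly like this. It is only an illustration: `multi_start`, `toy_objective`, `target` and the grid values are made up for the example, and Nelder-Mead is my own choice because it needs no gradient. Note that the guard in the question's log_likelihood makes the objective infinite whenever a parameter is <= 0, so the grid here stays strictly positive.

```python
import itertools

import numpy as np
from scipy.optimize import minimize


def multi_start(objective, grid, method="Nelder-Mead"):
    """Run minimize() from every combination of start values in `grid`
    and return the result with the lowest finite objective value."""
    best = None
    for start in itertools.product(*grid):
        res = minimize(objective, np.array(start), method=method)
        if np.isfinite(res.fun) and (best is None or res.fun < best.fun):
            best = res
    return best


# Hypothetical stand-in for negative_ll from the question: it has a known
# minimum at `target` and, like the question's guard, is infinite outside
# the strictly positive region.
target = np.array([2.5, 10.0, 0.5, 3.0])


def toy_objective(params):
    params = np.asarray(params, dtype=float)
    if np.any(params <= 0):
        return np.inf
    return float(np.sum((np.log(params) - np.log(target)) ** 2))


# Two candidate start values per parameter -> 2**4 = 16 starting points.
grid = [(0.5, 5.0)] * 4
best = multi_start(toy_objective, grid)
print(best.x, best.fun)
```

To use it on the real problem you would pass your own negative_ll instead of `toy_objective` and pick grid values that span the range you consider plausible for each parameter.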

If none of the above works out, the problem seems to be more sophisticated, but 90% of fitting problems can be solved by answering the questions above.
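
A test set with known parameters might be built along these lines. This is a sketch under a strong assumption: the likelihood in the question looks like the BG/NBD model to me, but the question never names it, so the simulation scheme (Gamma purchase rate, Beta dropout probability) is my guess. `simulate_customers`, `true_params` and `horizon` are hypothetical names and values, and the likelihood is a vectorized version of the one in the question, evaluated in log space.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import gammaln


def simulate_customers(r, alpha, a, b, n, horizon, rng):
    """Simulate n customers as rows (x, tx, t), assuming BG/NBD dynamics."""
    rows = []
    for _ in range(n):
        lam = rng.gamma(shape=r, scale=1.0 / alpha)  # purchase rate
        p = rng.beta(a, b)                           # dropout probability
        now, x, tx = 0.0, 0, 0.0
        while True:
            now += rng.exponential(1.0 / lam)        # wait for next purchase
            if now > horizon:
                break
            x, tx = x + 1, now
            if rng.random() < p:                     # customer drops out
                break
        rows.append((x, tx, horizon))
    return np.array(rows, dtype=float)


def negative_ll(params, data):
    """Negative log-likelihood, same terms as in the question."""
    r, alpha, a, b = params
    if min(params) <= 0:
        return np.inf
    x, tx, t = data.T
    ln_a1 = gammaln(r + x) - gammaln(r) + r * np.log(alpha)
    ln_a2 = gammaln(a + b) + gammaln(b + x) - gammaln(b) - gammaln(a + b + x)
    ln_a3 = -(r + x) * np.log(alpha + t)
    # log(a4); stays at -inf (i.e. a4 = 0) for customers with x == 0.
    ln_a4 = np.full_like(x, -np.inf)
    rep = x > 0
    ln_a4[rep] = (np.log(a) - np.log(b + x[rep] - 1)
                  - (r + x[rep]) * np.log(alpha + tx[rep]))
    return -np.sum(ln_a1 + ln_a2 + np.logaddexp(ln_a3, ln_a4))


rng = np.random.default_rng(0)
true_params = np.array([0.5, 5.0, 1.0, 3.0])  # made-up "known" values
data = simulate_customers(*true_params, n=2000, horizon=40.0, rng=rng)

start = np.array([1.0, 1.0, 1.0, 1.0])
res = minimize(negative_ll, start, args=(data,), method="Nelder-Mead",
               options={"maxiter": 4000})
print("true:", true_params)
print("fit: ", res.x, "converged:", res.success)
```

If the fit lands near `true_params` on data like this, the optimizer setup is probably fine and the trouble lies in the real data or the starting values; if it does not, the problem is in the likelihood or the method.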

jkalden
  • Thanks for the advice. Indeed, the params were not ideal and I modified them a little, but I still get no results since my guesses are totally empirical. My best guess for the moment is: [2.5,10,0.001,10] – sweeeeeet Dec 19 '14 at 09:58
  • How about a test subset, then? – jkalden Dec 19 '14 at 10:12
  • Create a set of maybe 10 customers where you know what the result should be. Does it converge, and if so, are the fit parameters close to your known best fit? – jkalden Dec 19 '14 at 10:16
  • If the solution is fine for you, please [mark the answer as accepted](http://stackoverflow.com/help/someone-answers) for other users to see there is a solution. Thank you! – jkalden Oct 26 '15 at 08:57
  • I am sorry, but this answer is not what I am looking for, so I can't accept it. – sweeeeeet Oct 26 '15 at 11:17