
I have been trying to estimate the parameters of the two-parameter Weibull distribution with a Newton method. As I was reading up on the Newton-Raphson algorithm, I found some aspects of it challenging to understand.

I've tried to implement it in Python and, honestly, I can't see anything wrong with my approach. But since I was struggling to understand the algorithm itself, I assume I am missing something. My code runs; the problem is that it doesn't find the correct estimates (1.9 and 13.6):

import math
import numpy as np

#data input in the Weibull dist.
t = np.array(list(range(1, 10)))
t = np.delete(t, [0])   # removes the first element, leaving [2, 3, ..., 9]

#calculating the first and second partial derivative of Weibull log-likelihood function
def gradient(a,b): 
    for i in t: 
        grad_a = np.array(-10*b/a + b/a*np.sum((i/a)**b),dtype = np.float)
        grad_b = np.array(10/b - 10*(math.log(a)) + np.sum(math.log(i)) - np.sum(((i/a)**b)*math.log(i/a)),np.float)     
        grad_matrix = np.array([grad_a, grad_b])
    return grad_matrix
    
def hessian(a,b): 
    for i in t: 
        hess_a = np.array((10*b/a**2 + (b*(b+1)/a**2)*np.sum((i/a)**b)),np.float)
        hess_b = np.array(10/b**2 + np.sum(((i/a)**b) * (math.log(i/a))**2),np.float)
        hessians = np.array([hess_a, hess_b]) 
    return hessians  

#Newton-Raphson
iters = 0     
a0, b0 = 5,15

while iters < 350:  
    if hessian(a0,b0).any() == 0.0:
        print('Divide by zero error!') 
    else:
        a = a0 - gradient(a0,b0)[0]/hessian(a0,b0)[0]
        b = b0 - gradient(a0,b0)[1]/hessian(a0,b0)[1]    
        print('Iteration-%d, a = %0.6f, b= %0.6f, e1 = %0.6f, e2 = %0.6f' % (iters, a,b,a-a0,b-b0))    
    if math.fabs(a-a0) >0.001 or math.fabs(b-b0) >0.001:
        a0,b0 = a,b
        iters = iters +1
    else: 
        break
print(a,b)
print(iters)    

**Output:**             
Iteration-0, a = 4.687992, b= 16.706941, e1 = -0.312008, e2 = 1.706941          
Iteration-1, a = 4.423289, b= 18.240714, e1 = -0.264703, e2 = 1.533773                
Iteration-2, a = 4.193403, b= 19.648545, e1 = -0.229886, e2 = 1.407831     

     

and so on, with each iteration moving further and further away from the correct estimate of the second parameter (b).

Weibull pdf: http://www.iosrjournals.org/iosr-jm/papers/Vol12-issue6/Version-1/E1206013842.pdf

kittycat
  • Could you give the equations of your 2-param Weibull distribution? I'd like to check your gradient and Hessian. By the way, it appears to me that you are just overwriting your grad_a and grad_b in your for-loop, instead of using +=. However, without exact notation I cannot easily verify your code. The Newton part seems to be OK. – flow_me_over Mar 24 '21 at 13:24
  • @flow_me_over, thank you so much for confirming that the NR at least seems okay! I used the following Weibull pdf: f(t; a, b) = b/a * (t/a)^(b-1) * exp{-(t/a)^b}. It corresponds to eq. (3.1) in the paper that I attached in my edited post, from which I also took the gradient and Hessian. The derivatives are taken from the log-likelihood of the Weibull pdf (written out just below these comments). – kittycat Mar 24 '21 at 19:23
  • @flow_me_over, could the problem be that I'm using a continuous Weibull pdf to derive the derivatives while my t is discrete? – kittycat Mar 25 '21 at 00:08
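For reference, here is a sketch of the log-likelihood implied by that pdf for a complete, uncensored sample t_1, ..., t_n, together with its first partial derivatives (these are the quantities the gradient code is meant to implement):

\ell(a, b) = n \ln b - n b \ln a + (b - 1) \sum_{i=1}^{n} \ln t_i - \sum_{i=1}^{n} \left( \frac{t_i}{a} \right)^{b}

\frac{\partial \ell}{\partial a} = -\frac{n b}{a} + \frac{b}{a} \sum_{i=1}^{n} \left( \frac{t_i}{a} \right)^{b}

\frac{\partial \ell}{\partial b} = \frac{n}{b} - n \ln a + \sum_{i=1}^{n} \ln t_i - \sum_{i=1}^{n} \left( \frac{t_i}{a} \right)^{b} \ln\left( \frac{t_i}{a} \right)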

1 Answer


OK. So first, let me mention that the paper you are using is not clear, and it surprises me that this work made it into a journal. Second, you state that your input data 't', which is 'x' in the paper, is a list of numbers from 0 to 9? I could not find this in the paper, but I'm assuming it is correct.

Below I have updated your gradient function, which was quite verbose and tricky to read. I have vectorized it for you using numpy. See if you understand it.

Your Hessian is incorrect. I believe there are some wrong signs in the second derivatives in the paper, and thus in yours. Maybe go over their derivation again? Nonetheless, regardless of the sign issues, your Hessian is not well defined. A 2x2 Hessian matrix contains the second derivatives d^2 logL / da^2 and d^2 logL / db^2 on the diagonal, and the mixed derivative d^2 logL / da db on the off-diagonal (positions (1,2) and (2,1) in the matrix). I have adjusted your code accordingly, but note that I have not corrected the probably erroneous signs.
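To make that structure explicit, the Hessian and the corresponding Newton-Raphson update in matrix form are (written generically, without fixing the disputed signs):

H(a, b) = \begin{pmatrix} \frac{\partial^2 \ell}{\partial a^2} & \frac{\partial^2 \ell}{\partial a \partial b} \\ \frac{\partial^2 \ell}{\partial a \partial b} & \frac{\partial^2 \ell}{\partial b^2} \end{pmatrix}, \qquad \begin{pmatrix} a_{k+1} \\ b_{k+1} \end{pmatrix} = \begin{pmatrix} a_k \\ b_k \end{pmatrix} - H(a_k, b_k)^{-1} \, \nabla \ell(a_k, b_k)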

To conclude, you might want to review your NR code again based on the Hessian changes (a sketch of the matrix-form update follows the code below) and make a while loop that ensures the algorithm stops once you meet your tolerance level.

#data input in the Weibull dist.
t = np.arange(0,10) # goes from 0 to 9, so N=10
N = len(t)

#calculating the first and second partial derivative of Weibull log-likelihood 
#function
def gradient(a,b): 
    grad_a = -N*b/a + b/a*np.sum(np.power(t/a,b))
    grad_b = N/b - N*np.log(a) + np.sum(np.log(t)) - np.sum(np.power(t/a,b) * np.log(t/a))

    return np.array([grad_a, grad_b])

def hessian(a,b):  
    hess_aa = N*b/a**2 + (b*(b+1)/a**2)*np.sum(np.power(t/a,b))
    hess_bb = N/b**2 + np.sum(np.power(t/a,b) * np.power(np.log(t/a),2))
    hess_ab = ....   # mixed partial d^2 logL / (da db) -- still to be filled in
    hess_ba = hess_ab
    hessians = np.array([[hess_aa, hess_ab],[hess_ba, hess_bb]]) 
    
    return hessians  
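Here is a minimal sketch of what the matrix-form Newton-Raphson loop could look like, assuming numpy is imported as np and that the gradient(a, b) and hessian(a, b) functions above have been completed (in particular the mixed partial hess_ab), so that they return a length-2 array and a full 2x2 matrix respectively:

#Newton-Raphson with the full 2x2 Hessian (sketch)
a, b = 5.0, 15.0            # starting values from the question
tol, max_iters = 1e-3, 350
iters = 0
converged = False

while not converged and iters < max_iters:
    step = np.linalg.solve(hessian(a, b), gradient(a, b))      # solves H @ step = grad
    a_new, b_new = a - step[0], b - step[1]
    converged = abs(a_new - a) < tol and abs(b_new - b) < tol  # stop when both updates are small
    a, b = a_new, b_new
    iters += 1

print(a, b, iters)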

I hope these comments help you further! Do note that Python has libraries to numerically find the optimum of the log-likelihood. See, for instance, the scipy library's 'minimize' function.
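As an illustration of that last point, here is a minimal sketch (not part of the original answer) that fits the same two-parameter Weibull by minimizing the negative log-likelihood with scipy.optimize.minimize; the data array and starting values are placeholders:

#maximum likelihood via scipy (sketch)
import numpy as np
from scipy.optimize import minimize

data = np.arange(1, 10)     # placeholder sample; must be strictly positive so log(data) is defined

def neg_log_likelihood(params):
    a, b = params           # scale a, shape b
    if a <= 0 or b <= 0:    # keep the optimizer inside the valid parameter region
        return np.inf
    n = len(data)
    return -(n * np.log(b) - n * b * np.log(a)
             + (b - 1) * np.sum(np.log(data))
             - np.sum((data / a) ** b))

res = minimize(neg_log_likelihood, x0=[5.0, 1.5], method='Nelder-Mead')
print(res.x)                # fitted (a, b)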

flow_me_over
  • thank you so much! Yes, indeed, I calculated the elements of the Hessian matrix by hand and now my NR converges to something! It was naive of me not to recheck the derivatives in the paper; so weird it got published. – kittycat Mar 25 '21 at 21:58