I have calculated loads for bridges and I want to fit the Gumbel's distribution to highest 20% of them using maximum likelihood estimate. I need help calculating parameters for the distribution. I have read through scipy.optimize documentation but I can't uderstand how to apply functions in there for estimating two parameter function.
Here is a bit of theory that might help: There are two likelihood functions (L1 and L2), one for values above some threshold (x>=C) and one for values below (x < C), now the likeliest parameters are those at the max value of mutiplication between two functions max(L1*L2). In this case the L1 is still the product of multiplication of probability density function's values at xi, but the L2 is the probability that threshold value C will be exceeded (1-F(C)).
Here's some code I have written:
non_truncated_data = ([15.999737471905252, 16.105716234887431, 17.947809230275304, 16.147752064149291, 15.991427126788327, 16.687542227378565, 17.125139229445359, 19.39645340792385, 16.837044960487795, 15.804473320190725, 16.018569387471025, 16.600876724289019, 16.161306985203151, 17.338636901595873, 18.477371969176406, 17.897236722220281, 16.626465201654593, 16.196548622931672, 16.013794215070927, 16.30367884232831, 17.182106070966608, 18.984566931768452, 16.885737663740024, 16.088051117522948, 15.790480003140173, 18.160947973898388, 18.318158853376037])
threshold = 15.78581825859324
def maximum_likelihood_function(non_truncated_loads, threshold, loc, scale):
"""Calculates maximum likelihood function's value for given truncated data
with given parameters.
Maximum likelihood function for truncated data is L1 * L2. Where L1 is a
product of multiplication of pdf values at non-truncated known values
(non_truncated_values). L2 is a the probability that threshold value will
be exceeded.
"""
is_first = True
# calculates L1
for x in non_truncated_loads:
if is_first:
L1 = gumbel_pdf(x, loc, scale)
is_first = False
else:
L1 *= gumbel_pdf(x, loc, scale)
# calculates L2
cdf_at_threshold = gumbel_cdf(threshold, loc, scale)
L2 = 1 - cdf_at_threshold
return L1*L2
def gumbel_pdf(x, loc, scale):
"""Returns the value of Gumbel's pdf with parameters loc and scale at x .
"""
# exponent
e = math.exp(1)
# substitute
z = (x - loc)/scale
return (1/scale) * (e**(-(z + (e**(-z)))))
def gumbel_cdf(x, loc, scale):
"""Returns the value of Gumbel's cdf with parameters loc and scale at x.
"""
# exponent
e = math.exp(1)
return (e**(-e**(-(x-loc)/scale)))