
After taking a couple of advanced statistics courses, I decided to write some functions/classes to automate estimating parameters for different distributions via MLE. In Matlab, the following is something I easily coded once:

function [ params, maxlik ] = newroutine( fun, data, guesses )
% Builds the negative log-likelihood from the user-supplied pdf fun.
lh = @(x, data) -sum(log(fun(x, data)));

options = optimset('Display', 'off', 'MaxIter', 1000000, 'TolX', 1e-20, 'TolFun', 1e-20);
% Minimizes the negative log-likelihood starting from the initial guesses.
% params: MLE estimates; maxlik: minimized negative log-likelihood.
[params, maxlik] = fminunc(@(x) lh(x, data), guesses, options);

end

All I have to do is correctly specify the underlying pdf equation as fun, and with a little more code I can calculate p-values, confidence intervals, etc.

With Python, however, all the sources I've found on MLE automation (for ex., here and here) insist that the easiest way to do this is to delve into OOP by subclassing statsmodels' `GenericLikelihoodModel`, which seems way too complicated for me. My reasoning is that, since the log-likelihood can be built automatically from the pdf (at least for the vast majority of distributions), and `scipy.stats.<dist>.fit()` already easily returns MLE estimates (see the sketch below), it seems ridiculous to have to write ~30 lines of class code each time you have a new distribution to fit.
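
For example, here is the kind of thing I mean, sketched with simulated gamma data standing in for a real sample:

import numpy as np
from scipy import stats

# simulate some data; in practice this would be your sample
data = stats.gamma.rvs(a=2.0, scale=3.0, size=1000)

# scipy's built-in MLE: returns (shape, loc, scale) for a gamma
shape, loc, scale = stats.gamma.fit(data)
print(shape, loc, scale)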

I realize that doing it the way the two links suggest lets you automatically tap into statsmodels' functions, but it honestly does not seem simpler than tapping into scipy yourself and writing much simpler functions.

Am I missing an easier way to perform basic MLE, or is there a really good reason for the way statsmodels does this?

  • *"... all the sources I've found on MLE automation (for ex., here and here) insist that the easiest way to do this is to delve into OOP using a subclass of statsmodel's, GenericLikelihoodModel"* The method shown in the section "Python Handcoded with SciPy" in [your first link](http://rlhick.people.wm.edu/posts/estimating-custom-mle.html) doesn't require using `statsmodels`. (It is true that the article recommends `statsmodels` over that method, though.) – Warren Weckesser Apr 08 '18 at 02:58
  • If all you care about are the parameter estimates, then just using scipy is enough. If you want standard errors and inference, then writing a few extra lines to make a subclass with statsmodels is still a lot less than handcoding everything with scipy. – Josef Apr 08 '18 at 04:33
  • It may be just me, but coding those ~30 lines each time, which don't look trivial, seems inefficient. That is why I think there has to be a better way. – Coolio2654 Apr 08 '18 at 05:44
  • @Coolio2654 If you just want to wrap many scipy distributions, then writing one subclass for each distribution is not the shortest way. You could make one subclass that takes a scipy distribution as `__init__` argument, and have `loglike` use the `logpdf` method (a sketch of that idea follows these comments). (I would like to add that to statsmodels.) – Josef Apr 08 '18 at 12:58
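
For reference, a minimal sketch of the wrapper Josef describes might look like the following; the class name `ScipyDistMLE` and the start values are made up for illustration, and details such as degrees-of-freedom bookkeeping may need adjustment:

import numpy as np
from scipy import stats
from statsmodels.base.model import GenericLikelihoodModel

class ScipyDistMLE(GenericLikelihoodModel):
    # Hypothetical wrapper (not an existing statsmodels class): fit any
    # scipy.stats distribution by MLE, reusing its logpdf as the likelihood.

    def __init__(self, endog, dist, **kwds):
        self.dist = dist
        super(ScipyDistMLE, self).__init__(endog, **kwds)

    def nloglikeobs(self, params):
        # per-observation negative log-likelihood; params follow scipy's
        # argument order for the distribution (shapes, then loc, scale)
        return -self.dist.logpdf(self.endog, *params)

# usage sketch: fit a normal, so params = (loc, scale)
data = stats.norm.rvs(loc=5.0, scale=2.0, size=1000)
res = ScipyDistMLE(data, stats.norm).fit(start_params=[np.mean(data), np.std(data)])
print(res.params)  # MLE estimates
print(res.bse)     # standard errors from the numerical Hessian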

1 Answer


I wrote the first post outlining the various methods, and I think it is fair to say that while I recommend the statsmodels approach, I did so to leverage the postestimation tools it provides and to get standard errors every time a model is estimated.

When using `minimize`, the Python equivalent of `fminunc` (as you outline in your example), oftentimes I am forced to use "Nelder-Mead" or some other gradient-free method to get convergence. Since I need standard errors for statistical inference, this entails an additional step using `numdifftools` to recover the Hessian. So in the end, the method you propose has its complications too (for my work). If all you care about is the maximum likelihood estimate and not inference, then the approach you outline is probably best and you are correct that you don't need the machinery of statsmodels.
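
As a concrete illustration, here is a minimal sketch of that workflow; the normal-sample likelihood is just a stand-in example:

import numpy as np
import numdifftools as nd
from scipy import optimize, stats

def neg_loglike(theta, data):
    # stand-in likelihood: normal with mean theta[0], std theta[1]
    return -np.sum(stats.norm.logpdf(data, loc=theta[0], scale=theta[1]))

data = stats.norm.rvs(loc=5.0, scale=2.0, size=1000)

# gradient-free optimization, as described above
res = optimize.minimize(neg_loglike, x0=[np.mean(data), np.std(data)],
                        args=(data,), method='Nelder-Mead')

# extra step: recover the Hessian numerically to get standard errors
hess = nd.Hessian(lambda theta: neg_loglike(theta, data))(res.x)
std_errors = np.sqrt(np.diag(np.linalg.inv(hess)))
print(res.x, std_errors)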

FYI: in a later post, I use your approach combined with autograd for significant speedups on large maximum likelihood models. I haven't successfully gotten this to work with statsmodels.
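
For illustration, a minimal sketch of that combination, assuming `autograd` is installed; the normal likelihood and the log-scale parameterization (to keep sigma positive) are my own choices for the example, not necessarily what the post uses:

import autograd.numpy as np
from autograd import grad
from autograd.scipy.stats import norm
from scipy import optimize

def neg_loglike(theta, data):
    # theta = [mu, log_sigma]; the log parameterization keeps sigma positive
    return -np.sum(norm.logpdf(data, theta[0], np.exp(theta[1])))

data = np.random.normal(5.0, 2.0, size=1000)

# autograd supplies an exact gradient, so a gradient-based optimizer
# converges without finite-difference overhead
res = optimize.minimize(neg_loglike, x0=np.array([0.0, 0.0]), args=(data,),
                        jac=grad(neg_loglike), method='BFGS')
print(res.x[0], np.exp(res.x[1]))  # estimates of mu and sigma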

Rob Hicks
  • Thanks for the reply! Did you find this post based on my link to your article? At any rate, thanks for giving the additional context, which is useful. If I may ask a question, you say the additional step in implementing statistical inference for MLE sans `statsmodels` involves calculating the Hessian yourself. Doesn't this only involve an additional 1 or 2 lines of code, unless I'm missing something? And thanks for your link about auto-differentiation, which sounds fascinating, to say the least. – Coolio2654 Apr 17 '18 at 20:06
  • Yeah, the link tipped me off. The code necessary to add the Hessian isn't too cumbersome; it is only a few extra lines (see the last code block in the "Python Handcoded using Scipy" section of my blog post that you referenced). – Rob Hicks Apr 18 '18 at 11:06