After taking a couple of advanced statistics courses, I decided to write some functions/classes to automate parameter estimation for different distributions via MLE. In Matlab, the routine below was easy to write once:
function [params, max_lh, confidence_interval] = newroutine(fun, data, guesses)
    % Negative log-likelihood, built automatically from the user-defined pdf fun.
    nll = @(x, data) -sum(log(fun(x, data)));
    options = optimset('Display', 'off', 'MaxIter', 1000000, 'TolX', 1e-20, 'TolFun', 1e-20);
    [theta, min_nll, ~, ~, ~, hess] = fminunc(@(x) nll(x, data), guesses, options);
    params = theta;
    max_lh = -min_nll;  % maximized log-likelihood (fminunc minimizes the negative)
    se = sqrt(diag(inv(hess)));  % standard errors from the observed information matrix
    confidence_interval = [theta(:) - 1.96*se, theta(:) + 1.96*se];  % approx. 95% Wald CIs
end
Here I just have to specify the underlying pdf correctly as fun, and with a little more code I can calculate p-values, confidence intervals, and so on.
With Python, however, all the sources I've found on automating MLE (for ex., here and here) insist that the easiest way to do this is to delve into OOP using a subclass of statsmodels' GenericLikelihoodModel, which seems far too complicated for me. My reasoning is that, since the log-likelihood can be built automatically from the pdf (at least for the vast majority of distributions), and scipy.stats.&lt;distribution&gt;.fit() already returns MLE estimates easily, it seems excessive to have to write ~30 lines of class code every time you have a new distribution to fit.
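For instance, here is a rough sketch of the kind of helper I have in mind, using scipy.optimize.minimize (generic_mle is my own illustrative name, and the normal pdf is just an example):

import numpy as np
from scipy import stats
from scipy.optimize import minimize

def generic_mle(pdf, data, guesses):
    # Negative log-likelihood, built automatically from the user-supplied pdf.
    def nll(theta):
        p = pdf(theta, data)
        if np.any(~np.isfinite(p)) or np.any(p <= 0):
            return np.inf  # guard against invalid parameter regions (e.g. sigma <= 0)
        return -np.sum(np.log(p))
    result = minimize(nll, guesses, method='Nelder-Mead')
    return result.x, -result.fun  # estimates and maximized log-likelihood

data = stats.norm.rvs(loc=5, scale=2, size=1000)
params, max_lh = generic_mle(lambda t, x: stats.norm.pdf(x, loc=t[0], scale=t[1]),
                             data, guesses=[np.mean(data), np.std(data)])

mu_hat, sigma_hat = stats.norm.fit(data)  # the built-in one-liner for comparison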
I realize that doing it the way the two links suggest lets you automatically tap into statsmodels' machinery (standard errors, p-values, and so on), but it honestly does not seem simpler than tapping into scipy yourself and writing much simpler functions.
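For comparison, my understanding of the subclassing route the two links describe is something like the sketch below (fitting a normal distribution; MyNormMLE is my own name, and I have not verified this beyond the linked examples):

import numpy as np
from scipy import stats
from statsmodels.base.model import GenericLikelihoodModel

class MyNormMLE(GenericLikelihoodModel):
    def nloglikeobs(self, params):
        # Per-observation negative log-likelihood, which statsmodels sums internally.
        mu, sigma = params
        return -stats.norm.logpdf(self.endog, loc=mu, scale=sigma)

data = stats.norm.rvs(loc=5, scale=2, size=1000)
res = MyNormMLE(data).fit(start_params=[np.mean(data), np.std(data)])
print(res.params, res.bse)  # estimates and standard errors come along for free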
Am I missing an easier way to perform basic MLE in Python, or is there a really good reason why statsmodels does it this way?