Maximum Likelihood Estimation (MLE) is one of the most important procedures for obtaining point estimates of a distribution's parameters, and it is the natural place to start.
Analytical Solution:
The multinomial distribution is an extension of the binomial distribution for which the MLE can be obtained analytically. Refer to this Math Stack Exchange post (MLE for Multinomial Distribution) for the full analytical solution. The procedure starts by defining a likelihood function L(p), conditioned on the observed data x(i), where p and x are the probabilities and observed occurrences for k classes/categories and i = 1, 2, ..., k. It is a measure of the likelihood of observing a set of observations (x) given a parameter set (p):
L(p) = \frac{n!}{x_1! \, x_2! \cdots x_k!} \; p_1^{x_1} p_2^{x_2} \cdots p_k^{x_k}
The main idea is to maximize the likelihood function over the range of the parameters (p). Given the total number of observations n (i.e. the sum of occurrences across all categories), the point estimates equal:
# a is assumed to be a pandas DataFrame of observed counts
# for the 12 categories (its definition is not shown here)
a.values/a.values.sum() # point estimates for p = x/n
# array([[0. ], [0.02941176], [0.05882353], [0.08823529],
# [0.05882353], [0.02941176], [0.17647059], [0. ],
# [0.02941176], [0.02941176], [0.20588235], [0.29411765]])
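This closed form can be sketched by maximizing the log-likelihood with a Lagrange multiplier for the constraint that the probabilities sum to one (the linked post gives the full derivation):

```latex
\log L(p) = \mathrm{const} + \sum_{i=1}^{k} x_i \log p_i
% Stationarity of the Lagrangian with multiplier \lambda for \sum_i p_i = 1:
\frac{x_i}{p_i} - \lambda = 0 \;\Rightarrow\; p_i = \frac{x_i}{\lambda}
% Summing over i and using \sum_i p_i = 1 gives \lambda = \sum_i x_i = n, so:
\hat{p}_i = \frac{x_i}{n}
```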
Numerical Solution:
The above results can also be obtained numerically using scipy.optimize.minimize. Notice that L(p) is a product of a factorial term and exponential terms. The factorial term is a constant that does not depend on the parameter values (p), so it can be dropped from the optimization. For the exponential terms, it is convenient to apply a log transformation to simplify the objective function; this is common practice for MLE, since log is a monotonically increasing function. Also, since scipy.optimize.minimize performs minimization, we will use the negative of our log-transformed likelihood function: maximizing a function is equivalent to minimizing its negative.
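Dropping the constant factorial term, the objective passed to the optimizer is therefore the negative log-likelihood:

```latex
-\ell(p) = -\sum_{i=1}^{k} x_i \log p_i,
\qquad \text{subject to } \sum_{i=1}^{k} p_i = 1,\; 0 < p_i \le 1
```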
import pandas as pd
import numpy as np
import scipy.optimize as sciopt
# bounds keep each parameter in (0, 1];
# an exact zero lower bound is avoided because log(0) diverges to -infinity
bnds = [(1e-15, 1) for i in range(12)]
# Initializing parameters value for optimization
init_parameters = np.asarray([0.1 for i in range(12)])
# Negative Log Likelihood Function
neg_log_lik = lambda p: -np.sum([a.values[i]*np.log(p[i]) for i in range(12)])
# Constraint sum(p) = 1
cons = {'type': 'eq', 'fun': lambda p: (sum([p[i] for i in range(12)]) - 1) }
# Minimizing neg_log_lik
results = sciopt.minimize(neg_log_lik, x0 = init_parameters,
method='SLSQP', bounds= bnds, constraints= cons)
results.x # point estimates for p
# array([1.00000000e-15, 2.94179308e-02, 5.88243586e-02, 8.82394605e-02,
# 5.88243586e-02, 2.94059735e-02, 1.76454713e-01, 1.00000000e-15,
# 2.94134577e-02, 2.94135714e-02, 2.05849197e-01, 2.94156978e-01])
Refer to the scipy.optimize.minimize documentation for details of the above implementation.
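As a self-contained check, the analytical and numerical estimates can be compared side by side. The counts below are hypothetical, reconstructed from the point estimates printed above (each estimate times n = 34), since the original data `a` is not shown:

```python
import numpy as np
import scipy.optimize as sciopt

# Hypothetical observed counts, consistent with the point estimates above;
# the original DataFrame `a` is not reproduced here.
counts = np.array([0, 1, 2, 3, 2, 1, 6, 0, 1, 1, 7, 10], dtype=float)
n, k = counts.sum(), len(counts)

# Analytical MLE: p = x / n
p_analytical = counts / n

# Numerical MLE: minimize the negative log-likelihood subject to sum(p) = 1
bnds = [(1e-15, 1) for _ in range(k)]
neg_log_lik = lambda p: -np.sum(counts * np.log(p))
cons = {'type': 'eq', 'fun': lambda p: np.sum(p) - 1}
res = sciopt.minimize(neg_log_lik, x0=np.full(k, 0.1),
                      method='SLSQP', bounds=bnds, constraints=cons)

# The two solutions agree up to optimizer tolerance
print(np.max(np.abs(res.x - p_analytical)))  # close to zero
```

Both approaches recover essentially the same probabilities; the tiny discrepancies (visible in the `results.x` output above) come from the optimizer's stopping tolerance and the small positive lower bound on each parameter.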