I'm writing a MathematicalProgram that requires evaluating the likelihood that a sample from a normal distribution lands close to a certain value. Ideally, I would want to write something like this, which uses the scipy.stats norm class:
```python
import numpy as np
from scipy.stats import norm

eps = 1e-2  # half-width of the window around the observation (placeholder value)

def log_likelihood_cost(x):
    # Probability mass of the normal distribution within eps of x.
    cdf_upper = norm.cdf(x + eps)
    cdf_lower = norm.cdf(x - eps)
    observation_likelihood = cdf_upper - cdf_lower
    return -np.log(observation_likelihood)

prog.AddCost(log_likelihood_cost, vars=x)
```
However, scipy doesn't play nicely with numpy arrays of dtype "object": the above code throws an error because the "isnan" operation can't be safely applied to them. I can think of a few not very satisfying workarounds:
- Instead of integrating around the value, using the probability density itself as the cost (sketched after this list). This is bad because I'd like the cost to stay positive, and the density is unbounded as sigma shrinks, so this cost can become arbitrarily negative if sigma is allowed to vary.
- Approximating the integral with numpy code, perhaps using the trapezoidal rule, maybe also caching the cdf over a grid and interpolating. This feels a bit like reinventing the wheel, and it would also be much more bug-prone.
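For reference, here's roughly what the first workaround would look like. This is only a sketch under my own assumptions (a fixed mean `mu` and, for simplicity, a fixed `sigma`; both names are mine). The body is plain arithmetic plus `np.log` of constants, so it evaluates fine on pydrake's AutoDiffXd scalars:

```python
import numpy as np

mu = 0.0     # hypothetical observed value
sigma = 1.0  # fixed here; if sigma were a decision variable, this cost is
             # unbounded below, since -log pdf -> -inf as sigma -> 0

def neg_log_density_cost(x):
    # Negative log of the normal density at x[0], written with elementary
    # operations so it also works on arrays of dtype "object".
    z = (x[0] - mu) / sigma
    return 0.5 * z * z + np.log(sigma) + 0.5 * np.log(2 * np.pi)
```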
Ideally, I'd avoid resorting to either of those. I've looked into kGaussian and the like, but there don't seem to be any methods for computing a cdf, at least among those exposed in Python. What would be the pydrake way of doing this?
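For concreteness, the closest thing I've come up with is a cost callable that branches on the dtype and assembles the gradient by hand, using the fact that the derivative of the cdf is the pdf. This is just a sketch (scalar x, standard normal, my own eps), and I'm not sure it's the intended pattern:

```python
import numpy as np
from scipy.stats import norm
from pydrake.autodiffutils import AutoDiffXd

eps = 1e-2  # half-width of the window around the observation (placeholder)

def log_likelihood_cost(x):
    if x.dtype == float:
        # Plain-float evaluation path.
        p = norm.cdf(x[0] + eps) - norm.cdf(x[0] - eps)
        return -np.log(p)
    # AutoDiffXd path: evaluate scipy on the underlying float value, then
    # apply the chain rule manually, using d/dv cdf(v) = pdf(v).
    v = x[0].value()
    p = norm.cdf(v + eps) - norm.cdf(v - eps)
    dp_dv = norm.pdf(v + eps) - norm.pdf(v - eps)
    dcost_dv = -dp_dv / p  # derivative of -log(p) with respect to v
    return AutoDiffXd(-np.log(p), dcost_dv * x[0].derivatives())

prog.AddCost(log_likelihood_cost, vars=x)
```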