I'm working on implementing a probability density function of a multivariate Gaussian in C++, and I'm stuck on how to best handle the cases where dimension > 2.
The pdf of a gaussian can be written as
where (A)' or A' represents the transpose of the 'matrix' created by subtracting the mean from all elements of x. In this equation, k is the number of dimensions we have, and sigma represents the covariance matrix, which is a k x k matrix. Finally, |X| means the determinant of the matrix X.
In the univariate case, implementing the pdf is trivial. Even in the bivariate (k = 2) case, it's trivial. However, when we go further than two dimensions, the implementation is much harder.
In the bivariate case, we'd have
where rho is the correlation between x and y, with correlation equal to
In this case, I could use Eigen::Matrix<double, Eigen::Dynamic, Eigen::Dynamic>
to implement the first equation, or just calculate everything myself using the second equation, without benefiting from Eigen's simplified linear algebra interface.
My thoughts for an attempt at the multivariate case would probably begin by extending the above equations to the multivariate case
with
My questions are:
- Would it be appropriate/advised to use a
boost::multi_array
for the n-dimensional array, or should I instead try to leverage Eigen? - Should I have separate functions for the univariate/bivariate cases, or should I just abstract it all to the multivariate case using boost::multi_array (or an appropriate alternative)?