
How can I estimate (fit) the parameters of an exponentially modified Gaussian (ex-Gaussian) distribution in Java/Android?

I need something like the following pseudocode:

    // some observed data points
    double dataPoints[] = {200,300,400,278,366,466,325,335,322,332};

    // ex-gaussian distribution
    ExponentiallyModifiedGaussianDistribution exGaussian = new ExponentiallyModifiedGaussianDistribution();

    // MLE
    MaximumLikelihoodEstimation MLE = new MaximumLikelihoodEstimation(dataPoints, exGaussian);
    MLE.setGuess(3.0, 1.0, 1.0);
    MLE.compute();

    // get estimated / fitted parameters
    double[] parameterEstimates = MLE.getEstimates();

There are some examples demonstrating parameter estimation for the Gamma distribution, but that library does not seem to be open source.

I have also found an ex-Gaussian distribution implementation in Java, but it lacks parameter estimation.

I think there are many ways to estimate the parameters, e.g. maximum likelihood estimation (MLE).

Update 1:

I will avoid using fewer than 40 data points.

Update 2:

An easy alternative for estimating the parameters of the distribution is the method of moments (described on Wikipedia).

lidox

3 Answers


Maybe you can get a better answer on stats.stackexchange.com.

But I think that you can create an optimization algorithm that does moment matching. Basically you need to minimize the difference of the first moment (mean), second moment (variance), third moment (skewness), etc., between the sample data and the theoretical distribution.

You can define the objective function as the sum (or product) of the moment differences. You can also give the moments different weights (e.g. the mean receives a higher weight).

You can derive the gradient of the objective function analytically (e.g. with Mathematica or SageMath) and use gradient descent, or approximate the derivatives numerically with finite differences. This approach is widely used for estimating parameters in regression, logistic regression, and ANNs.

You can also use metaheuristic algorithms (genetic algorithms, tabu search, etc.).
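The moment-matching idea above can be sketched in plain Java roughly as follows. This is only a minimal illustration, not a tuned optimizer: the class name `MomentMatchingSketch`, the weights, the starting guess, and the backtracking step-size rule are assumptions made for this example, and the target moments in `main` are generated from made-up parameters rather than from real data. The objective is a weighted sum of squared differences between the theoretical ex-Gaussian mean, variance, and skewness and the target values; the gradient is approximated with central finite differences.

```java
final class MomentMatchingSketch {

    /** Theoretical {mean, variance, skewness} of an ex-Gaussian with p = {mu, sigma, tau}. */
    static double[] theoreticalMoments(double[] p) {
        double mu = p[0], sigma = p[1], tau = p[2];
        double variance = sigma * sigma + tau * tau;
        double skewness = 2.0 * tau * tau * tau / Math.pow(variance, 1.5);
        return new double[] {mu + tau, variance, skewness};
    }

    /** Weighted sum of squared moment differences; infinite outside sigma > 0, tau > 0. */
    static double objective(double[] p, double[] target, double[] weights) {
        if (p[1] <= 0 || p[2] <= 0) {
            return Double.POSITIVE_INFINITY;
        }
        double[] m = theoreticalMoments(p);
        double sum = 0.0;
        for (int i = 0; i < m.length; i++) {
            double d = m[i] - target[i];
            sum += weights[i] * d * d;
        }
        return sum;
    }

    /** Central finite-difference approximation of the gradient. */
    static double[] gradient(double[] p, double[] target, double[] weights, double h) {
        double[] g = new double[p.length];
        for (int i = 0; i < p.length; i++) {
            double[] up = p.clone();
            double[] down = p.clone();
            up[i] += h;
            down[i] -= h;
            g[i] = (objective(up, target, weights) - objective(down, target, weights)) / (2 * h);
        }
        return g;
    }

    /** Gradient descent; each step is halved until the objective decreases. */
    static double[] fit(double[] start, double[] target, double[] weights, int maxIterations) {
        double[] p = start.clone();
        double f = objective(p, target, weights);
        for (int iter = 0; iter < maxIterations; iter++) {
            double[] g = gradient(p, target, weights, 1e-6);
            boolean improved = false;
            for (double step = 1.0; step > 1e-12; step *= 0.5) {
                double[] q = new double[p.length];
                for (int i = 0; i < p.length; i++) {
                    q[i] = p[i] - step * g[i];
                }
                double fq = objective(q, target, weights);
                if (fq < f) {
                    p = q;
                    f = fq;
                    improved = true;
                    break;
                }
            }
            if (!improved) {
                break; // no step length reduces the objective any further
            }
        }
        return p;
    }

    public static void main(String[] args) {
        // Illustrative target moments from made-up parameters (not real data).
        double[] target = theoreticalMoments(new double[] {0.3, 0.05, 0.2});
        double[] weights = {1.0, 1.0, 1.0};   // hypothetical equal weights
        double[] start = {0.4, 0.1, 0.1};     // hypothetical initial guess
        double[] fitted = fit(start, target, weights, 500);
        System.out.println("mu=" + fitted[0] + " sigma=" + fitted[1] + " tau=" + fitted[2]);
        System.out.println("objective=" + objective(fitted, target, weights));
    }
}
```

With real data, `target` would hold the sample mean, variance, and skewness instead. Because the three moment equations can also be solved in closed form, an iterative search like this is mainly useful when extra moments or custom weights are added.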

  • So maybe I can use the Nelder-Mead simplex optimizer to estimate the parameters. There is an existing library for it in Java. – lidox May 16 '17 at 13:13
  • The Wikipedia article suggests using the first three moments. https://en.wikipedia.org/wiki/Exponentially_modified_Gaussian_distribution "(...) The parameters of the distribution can be estimated from the sample data with the method of moments as follows:[4][5] (...)" – Danilo M. Oliveira May 16 '17 at 13:29
  • Well, I can calculate the mean and standard deviation. But what about γ1, the skewness? How can I get its value from only some data points? – lidox May 16 '17 at 13:37
  • γ1 = (mean - median) / std-deviation. I found this below the section on skew. Thanks for your help! – lidox May 16 '17 at 13:46
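
For reference, the moment-based sample skewness γ1 can be computed directly from the data points as the third central moment divided by the 1.5th power of the second central moment. The sketch below is a minimal plain-Java illustration; the class name `SampleMoments` and its method names are made up for this example, and it uses the population (divide-by-n) moments, so the values can differ slightly from library routines that apply bias corrections. Pearson's nonparametric skew, (mean - median) / standard deviation, is included for comparison; it is a different statistic that always lies between -1 and 1, so the two should not be expected to agree.

```java
import java.util.Arrays;

final class SampleMoments {

    /** Arithmetic mean of the data. */
    static double mean(double[] x) {
        double sum = 0.0;
        for (double v : x) {
            sum += v;
        }
        return sum / x.length;
    }

    /** Population (divide-by-n) central moment of order k. */
    static double centralMoment(double[] x, int k) {
        double m = mean(x);
        double sum = 0.0;
        for (double v : x) {
            sum += Math.pow(v - m, k);
        }
        return sum / x.length;
    }

    /** Moment skewness: third central moment over the 1.5th power of the second. */
    static double skewness(double[] x) {
        return centralMoment(x, 3) / Math.pow(centralMoment(x, 2), 1.5);
    }

    /** Median of the data (average of the two middle values for even n). */
    static double median(double[] x) {
        double[] sorted = x.clone();
        Arrays.sort(sorted);
        int n = sorted.length;
        return (n % 2 == 1) ? sorted[n / 2] : 0.5 * (sorted[n / 2 - 1] + sorted[n / 2]);
    }

    /** Pearson's nonparametric skew: (mean - median) / population standard deviation. */
    static double nonparametricSkew(double[] x) {
        return (mean(x) - median(x)) / Math.sqrt(centralMoment(x, 2));
    }

    public static void main(String[] args) {
        // Data points taken from the question.
        double[] dataPoints = {200, 300, 400, 278, 366, 466, 325, 335, 322, 332};
        System.out.println("moment skewness        = " + skewness(dataPoints));
        System.out.println("nonparametric skewness = " + nonparametricSkew(dataPoints));
    }
}
```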

An easy alternative for estimating the parameters of the distribution is the method of moments (described on Wikipedia).
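
For reference, the moment equations behind this approach can be written out as below. The notation is an assumption chosen for this note: m, s, and γ1 denote the sample mean, standard deviation, and skewness, and μ, σ, τ = 1/λ are the ex-Gaussian parameters. The equations follow from the ex-Gaussian being the sum of a normal and an independent exponential variable.

```latex
\begin{aligned}
m &= \mu + \tau, &
s^2 &= \sigma^2 + \tau^2, &
\gamma_1 &= \frac{2\tau^3}{(\sigma^2 + \tau^2)^{3/2}} \\[4pt]
\hat{\tau} &= s \left(\frac{\gamma_1}{2}\right)^{1/3}, &
\hat{\mu} &= m - \hat{\tau}, &
\hat{\sigma}^2 &= s^2 \left[ 1 - \left(\frac{\gamma_1}{2}\right)^{2/3} \right]
\end{aligned}
```

Note that the estimate of σ² is only positive when γ1 lies between 0 and 2, so strongly skewed or left-skewed samples have no valid solution under these formulas.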

Implementation:

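    // imports needed by this snippet (package names as in ELKI 0.7.x and Apache Commons Math 3)
    import de.lmu.ifi.dbs.elki.math.statistics.distribution.ExponentiallyModifiedGaussianDistribution;
    import org.apache.commons.math3.stat.descriptive.DescriptiveStatistics;

    // some observed data points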
    double dataPoints[] = {0.464,0.443,0.424,0.386,0.367,0.382,0.455,0.410,0.411,0.424,0.338,0.355,0.342,0.324,
            0.354,0.322,0.364,0.375,1.085,0.575,0.597,0.464,0.414,0.408,1.156,0.819,1.156,1.024,1.152,1.103,
            0.431,0.378,0.358,0.382,0.354,0.435,0.386,0.361,0.397,0.362,0.334,0.357,0.344,0.362,0.317,0.331,
            0.199,0.351,0.284,0.343,0.354,0.336,0.280,0.312,0.778,0.723,0.755,0.774,0.759,0.762,0.490,0.400,
            0.364,0.439,0.441,0.673};

    DescriptiveStatistics maths = new DescriptiveStatistics(dataPoints);
    double sampleMean = maths.getMean();
    double sampleStdDev = maths.getStandardDeviation();
    // Pearson's nonparametric skew, used here as a stand-in for the moment skewness γ1;
    // maths.getSkewness() would return the moment-based sample skewness instead
    double sampleSkew = Math.abs(sampleMean - maths.getPercentile(50)) / sampleStdDev;

    // parameter estimation for the ex-Gaussian distribution using the method of moments
    double tau = sampleStdDev * Math.pow(sampleSkew / 2., 1. / 3);
    double mean = sampleMean - tau;
    // sigma^2 = s^2 * (1 - (skew/2)^(2/3)): the variance, not the std-dev, goes under the root
    double stdDev = sampleStdDev * Math.sqrt(1 - Math.pow(sampleSkew / 2., 2. / 3));
    double lambda = 1 / tau;

    ExponentiallyModifiedGaussianDistribution exGaussian = new ExponentiallyModifiedGaussianDistribution(mean, stdDev, lambda);
    System.out.println(sampleStdDev);
    System.out.println(exGaussian.getStddev());

Gradle Dependencies:

compile group: 'de.lmu.ifi.dbs.elki', name: 'elki', version: '0.7.1'

compile group: 'org.apache.commons', name: 'commons-math3', version: '3.6'

lidox

ELKI contains various estimators for distributions (note that for some distributions we have multiple estimation techniques; and for some we don't have any yet - please contribute!):

ExponentiallyModifiedGaussianDistribution dist = 
    EMGOlivierNorbergEstimator.STATIC.estimate(dataPoints, DoubleArrayAdapter.STATIC);

This yields an ex-Gaussian distribution:

> ExGaussianDistribution(mean=0.2675761092764285, stddev=0.07999178722695827,
      lambda=4.4179732613344)

You can also try a best-fit estimate.

Distribution dist = BestFitEstimator.STATIC.estimate(dataPoints, DoubleArrayAdapter.STATIC);

This indicates that a shifted log-normal may be a better fit for your data. Nevertheless, an EMG may be theoretically more suitable for your problem; a shifted log-normal is always a bit odd.

> LogNormalDistribution(logmean=-1.945322593396174, logstddev=0.968522285758599,
      shift=0.2654438504801123)
Erich Schubert