0

I'm experimenting with Gaussian distribution and its likelihood. To figure out max likelihood I differentiate likelihood with respect to mu (expectation) and sigma (mean), which equal to data.mean() and data.std() correspondingly

import matplotlib.pyplot as plt
import numpy as np
import matplotlib.mlab as mlab
import math
from scipy.stats import norm

def calculate_likelihood(x, mu, sigma): 
    n = len(x)  
    likelihood = n/2.0 * np.log(2 * np.pi) + n/2.0 * math.log(sigma **2 ) + 1/(2*sigma**2) * sum([(x_i - mu)**2 for x_i in x ])

    return likelihood

def estimate_gaussian_parameters_from_data(data):
    return data.mean(), data.std()

def main():
    mu = 0
    sigma = 2
    x_values = np.linspace(mu - 3*sigma, mu + 3*sigma, 1000)
    y_values_1 = mlab.normpdf(x_values, mu, sigma)

    estimated_mu, estimated_sigma = estimate_gaussian_parameters_from_data(y_values_1)

if (__name__ == "__main__"):
    main()

I expected that estimated_mu and estimated_sigma should approximately be equal to mu and sigma, but that's not the case. Instead of 0 and 2 I get 0.083 and 0.069. Do I understand anything wrong?

amplifier
  • 1,793
  • 1
  • 21
  • 55

1 Answers1

1

mlab.normpdf is a pdf it returns the probability of x. Since mean is 0 you will see points around 0 having high probability. y_values_1's are the probability densities.

s = np.random.normal(0, 2, 1000)

The above code samples 1000 points which are normally distributed with mean 0 and std 2

np.mean(s) == 0.018308805079364696 and np.std(s) == 1.9467605916031896
mujjiga
  • 16,186
  • 2
  • 33
  • 51
  • You're hitting the nail's head, but please do not call values of `y` probabilities. They are probability densities. – dedObed Mar 07 '19 at 19:36