
In Python, I have estimated the parameters of a density model for my data, and I would like to plot the density function over the histogram of the data. In R, this is done with the option prob=TRUE.

import numpy as np
import matplotlib.mlab as mlab
import matplotlib.pyplot as plt

# initialization of the list "data"
# estimation of the parameter, in my case, mean and variance of a normal distribution

plt.hist(data, bins="auto") # data is the list of data
# here I would like to draw the density above the histogram
plt.show()

I guess the trickiest part is to make it fit.

Edit: I have tried this according to the first answer:

mean = np.mean(logdata)
var  = np.var(logdata)
std  = np.sqrt(var) # standard deviation: normpdf expects it rather than the variance
plt.hist(logdata, bins="auto", alpha=0.5, label="empirical data")
x = np.linspace(min(logdata), max(logdata), 100)
plt.plot(x, mlab.normpdf(x, mean, std))
plt.xlabel("log(file size)")
plt.ylabel("number of files")
plt.legend(loc='upper right')
plt.grid(True)
plt.show()

But the curve doesn't fit the graph. I would like the density to match the histogram, but its values are far too small compared to the bar heights.

Edit 2: It works with the option normed=True in the histogram function. The result looks like what R produces with the option prob=TRUE.
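The reason normalization fixes the mismatch is that a probability density integrates to 1, while a count histogram has a total bar area of N times the bin width. A small numerical check of this, assuming synthetic normal data as a stand-in for `logdata` (the sample, seed, and variable names are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, for illustration only
data = rng.normal(loc=10.0, scale=2.0, size=1000)  # stand-in for logdata

# Raw counts: with equal-width bins the bar areas sum to N * bin_width.
counts, edges = np.histogram(data, bins="auto")
widths = np.diff(edges)
count_area = np.sum(counts * widths)

# Normalized heights: the bar areas sum to 1, like a density.
heights, _ = np.histogram(data, bins=edges, density=True)
density_area = np.sum(heights * widths)

print(count_area, len(data) * widths[0])
print(density_area)
```

So an unnormalized histogram sits on a scale that is larger than the density by roughly the factor N times the bin width, which is why the plotted curve looked too small.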

  • It's not clear what you are looking for. Can you add screenshots/more information on how you would like the figure to look? As always, a [Minimal, Complete, and Verifiable Example](http://stackoverflow.com/help/mcve) would massively help your chances of getting a good answer. – DavidG Oct 23 '17 at 12:23
  • Here it is. Yes, you are right, an example is better than a thousand explanations! But I still thought my explanation was pretty clear... – Nicolas Scotto Di Perto Oct 23 '17 at 13:19
  • I have added a solution if you don't want to use `normed = True`. – DavidG Oct 23 '17 at 14:08

2 Answers


If I understand you correctly, you have the mean and standard deviation of some data. You have plotted a histogram of the data and would like to draw the normal density curve over it. This curve can be generated using matplotlib.mlab.normpdf().

import numpy as np
import matplotlib.mlab as mlab
import matplotlib.pyplot as plt

mean = 100
sigma = 5

data = np.random.normal(mean,sigma,1000) # generate fake data
x = np.linspace(min(data), max(data), 100)

plt.hist(data, bins="auto",normed=True)
plt.plot(x, mlab.normpdf(x, mean, sigma))

plt.show()

Which gives a normalized histogram with the normal density curve drawn over it.
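On recent Matplotlib releases this snippet no longer runs as written: `matplotlib.mlab.normpdf` and the `normed` argument of `hist` were both removed in Matplotlib 3.1. Here is a minimal sketch of the same plot, assuming `density=True` as the replacement for `normed=True` and a hand-written `normal_pdf` helper (a hypothetical name; `scipy.stats.norm.pdf(x, mean, sigma)` computes the same values if SciPy is available):

```python
import numpy as np
import matplotlib.pyplot as plt


def normal_pdf(x, mean, sigma):
    """Density of a normal distribution with the given mean and standard deviation."""
    return np.exp(-((x - mean) ** 2) / (2 * sigma ** 2)) / (sigma * np.sqrt(2 * np.pi))


def plot_hist_with_density(data, mean, sigma):
    """Draw a normalized histogram of data with the normal density on top."""
    fig, ax = plt.subplots()
    ax.hist(data, bins="auto", density=True, alpha=0.5, label="empirical data")
    x = np.linspace(np.min(data), np.max(data), 200)
    ax.plot(x, normal_pdf(x, mean, sigma), label="normal density")
    ax.legend(loc="upper right")
    return ax


if __name__ == "__main__":
    data = np.random.normal(100, 5, 1000)  # fake data, as in the answer
    # Parameters estimated from the sample, as in the question.
    plot_hist_with_density(data, np.mean(data), np.std(data))
    plt.show()
```

Passing the estimated mean and standard deviation, rather than the true ones, mirrors the situation in the question, where only the sample is known.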

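Edit: The above only works with normed=True. If this is not an option, we can define our own function and scale its amplitude to the histogram counts:

def gauss_function(x, a, x0, sigma):
    return a * np.exp(-(x - x0) ** 2 / (2 * sigma ** 2))

mean = 100
sigma = 5

data = np.random.normal(mean, sigma, 1000) # generate fake data
x = np.linspace(min(data), max(data), 1000)

counts, bins, _ = plt.hist(data, bins="auto")

# peak height on a count axis: number of samples * bin width * peak of the normal density
amplitude = len(data) * (bins[1] - bins[0]) / (sigma * np.sqrt(2 * np.pi))
test = gauss_function(x, amplitude, mean, sigma)
plt.plot(x, test)

plt.show()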
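
The amplitude the curve needs on a count axis can be worked out directly. For N samples and bins of equal width Δ, the expected count in the bin around x is approximately N Δ f(x), where f is the normal density. N and Δ are symbols introduced here for illustration; a, x_0 and σ match the arguments of gauss_function:

```latex
\[
\text{count}(x) \approx N \,\Delta\, f(x)
= \frac{N\,\Delta}{\sigma\sqrt{2\pi}}
  \exp\!\left(-\frac{(x - x_0)^2}{2\sigma^2}\right),
\qquad\text{so}\qquad
a = \frac{N\,\Delta}{\sigma\sqrt{2\pi}}.
\]
```
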
DavidG

Everything you are looking for is already in seaborn.

You just have to use distplot:

import seaborn as sns
import numpy as np

data = np.random.normal(5, 2, size=1000)
sns.distplot(data)


Trenton McKinney