How to make a CDF in Python?

Question

I made the PDF using the histogram code below:

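import numpy as np
import matplotlib.pyplot as plt

plt.figure()

# Reuse the first histogram's bin edges so both samples share the same bins
values1, bins1, _ = plt.hist(np.log10(fakeclusterlum), bins=20)

plt.hist(np.log10(bigclusterlum151mh), alpha=0.5, bins=bins1)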

But I am not sure how to turn this into a CDF. I want to plot both the fakeclusterlum and bigclusterlum151mh points. I hope that makes sense; if it doesn't, I apologise, I am somewhat of a beginner!

Hi! The short answer is that the CDF is the integral of the PDF. In terms of a histogram, CDF(x) = sum of the bin areas (rectangles) up to x. Assuming a continuous PDF, this also means that the more bins you have, the more accurate your CDF is. But I am not sure what exactly you are trying to achieve. Could you elaborate on the type of data you have and what you want to do with it? Some plots of your data would also help. - alexblae, Oct 10 '18 at 11:10
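The "sum of rectangles" idea in the comment above can be sketched with NumPy alone. This is a minimal illustration, not the asker's actual data: log_lum is a hypothetical, randomly generated stand-in for np.log10(fakeclusterlum). It uses np.histogram to get the bin counts and np.cumsum to accumulate them, then divides by the total so the curve ends at 1.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical stand-in for np.log10(fakeclusterlum); replace with real data
log_lum = rng.normal(loc=30.0, scale=1.0, size=1000)

# Bin counts and bin edges (20 bins -> 21 edges)
counts, edges = np.histogram(log_lum, bins=20)

# Running sum of the counts, normalised by the total number of points.
# cdf[i] is the fraction of points at or below the right edge of bin i,
# i.e. it pairs with edges[i + 1].
cdf = np.cumsum(counts) / counts.sum()

for right_edge, fraction in zip(edges[1:], cdf):
    print(f"{right_edge:8.3f}  {fraction:.3f}")
```

Plotting edges[1:] against cdf gives a binned CDF; increasing bins makes the steps finer.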

score 0 · Answer 1 · answered Oct 10 '18 at 11:30

matplotlib's pyplot.hist has a cumulative argument:

cumulative : bool, optional
If True, then a histogram is computed where each bin gives the counts in that bin plus all bins for smaller values. The last bin gives the total number of datapoints.
Default: False

Hence use

plt.hist(..., cumulative=True)

to plot a cumulative histogram.
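Applied to the two samples from the question, this might look like the sketch below. The arrays here are hypothetical, randomly generated stand-ins for fakeclusterlum and bigclusterlum151mh. Two options go beyond what the answer states and are my own additions: density=True rescales the cumulative counts so each curve ends at 1 (a normalised CDF rather than raw counts), and histtype="step" draws outlines so the two curves stay readable when overlaid.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(1)
# Hypothetical stand-ins for the two luminosity arrays in the question
fakeclusterlum = 10 ** rng.normal(30.0, 1.0, size=500)
bigclusterlum151mh = 10 ** rng.normal(30.5, 0.8, size=300)

fig, ax = plt.subplots()

# First sample: cumulative, normalised histogram; keep its bin edges
values1, bins1, _ = ax.hist(
    np.log10(fakeclusterlum), bins=20,
    cumulative=True, density=True, histtype="step",
    label="fakeclusterlum",
)

# Second sample on the same bin edges, as in the original code
values2, _, _ = ax.hist(
    np.log10(bigclusterlum151mh), bins=bins1,
    cumulative=True, density=True, histtype="step",
    label="bigclusterlum151mh",
)

ax.set_xlabel("log10(luminosity)")
ax.set_ylabel("cumulative fraction")
ax.legend(loc="upper left")
```

Call plt.show() or fig.savefig(...) afterwards to display or save the figure. Note that points of the second sample falling outside bins1 are not counted, so its curve is normalised over the points inside that range only.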
