I made the PDF (a histogram) with the code below:

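import numpy as np
import matplotlib.pyplot as plt
plt.figure()
# Histogram the first sample, then reuse its bin edges for the second one.
values1, bins1, _ = plt.hist(np.log10(fakeclusterlum), bins=20)
plt.hist(np.log10(bigclusterlum151mh), alpha=0.5, bins=bins1)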

However, I am not sure how to turn this into a CDF plot. I want to plot the CDFs of both the fakeclusterlum and bigclusterlum151mh points. I apologise if this doesn't make sense; I am somewhat of a beginner!

Sagar Zala
  • Hi! The short answer is that the CDF is the integral of the PDF. For a histogram, this means CDF(x) = sum of the bin areas up to x. Assuming the underlying PDF is continuous, more bins give a finer approximation of the CDF, provided each bin still contains enough data. I am not sure exactly what you are trying to achieve, though. Could you say more about the type of data you have and what you want to do with it? Some plots of your data would also help. – alexblae Oct 10 '18 at 11:10
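The "sum of rectangles" idea from the comment can be sketched with NumPy alone. This is a minimal illustration, not the asker's actual data: the `sample` array is synthetic (a log-normal draw is assumed only so that `log10` is defined), and the names `counts`, `edges`, and `cdf` are made up for the example.

```python
import numpy as np

# Synthetic stand-in for the asker's data (assumption: positive values).
rng = np.random.default_rng(0)
sample = rng.lognormal(mean=2.0, sigma=0.5, size=1000)

# Bin the log10 values, as in the question.
counts, edges = np.histogram(np.log10(sample), bins=20)

# Running sum of the bin counts, normalised so the final value is 1.
cdf = np.cumsum(counts) / counts.sum()

# cdf[i] is the fraction of points at or below the right edge edges[i + 1].
```

Plotting `cdf` against `edges[1:]` would then give an empirical CDF at the resolution of the chosen bins.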

1 Answer

pyplot.hist has an argument

cumulative : bool, optional
If True, then a histogram is computed where each bin gives the counts in that bin plus all bins for smaller values. The last bin gives the total number of datapoints.
Default: False

Hence use

plt.hist(..., cumulative=True)

to plot a cumulative histogram.
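Applied to the code in the question, this might look like the sketch below. The two arrays are synthetic stand-ins for `fakeclusterlum` and `bigclusterlum151mh` (their real contents are not shown in the question), and `density=True` and `histtype="step"` are optional extras assumed here so that each curve ends at 1 and the two curves stay readable when overlaid.

```python
import numpy as np
import matplotlib.pyplot as plt

# Synthetic stand-ins for the asker's arrays (assumption: positive values).
rng = np.random.default_rng(0)
fakeclusterlum = rng.lognormal(mean=2.0, sigma=0.5, size=1000)
bigclusterlum151mh = rng.lognormal(mean=2.2, sigma=0.4, size=300)

plt.figure()

# cumulative=True accumulates the bins; density=True normalises the
# result so that the last bin equals 1, as expected for a CDF.
values1, bins1, _ = plt.hist(np.log10(fakeclusterlum), bins=20,
                             cumulative=True, density=True,
                             histtype="step", label="fakeclusterlum")

# Reuse the first sample's bin edges, as in the question.
values2, _, _ = plt.hist(np.log10(bigclusterlum151mh), bins=bins1,
                         cumulative=True, density=True,
                         histtype="step", label="bigclusterlum151mh")

plt.xlabel("log10(luminosity)")
plt.ylabel("cumulative fraction")
plt.legend(loc="upper left")
# Call plt.show() or plt.savefig(...) to display or save the figure.
```

Note that points of the second sample that fall outside `bins1` are not counted, so its curve is the CDF of the in-range points only.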

ImportanceOfBeingErnest