6
x = [1 2 3 3 4]
cdfplot(x)

After some searching, I found that the code above plots a cumulative distribution function (CDF) in MATLAB.
Is there a simple way to plot a probability density function (PDF)?

To clarify: I need a graph with an evenly spaced x-axis, and I would prefer that it not look like a bar graph. (I will have millions of integers.)
Update: My data are integers, but they actually represent time. I expect several very high peaks at exactly the same values, while the remaining values should look as if they were not discrete. I'm starting to wonder whether the data are inherently discrete at all. A CDF would definitely work, but a PDF seems to be more complicated than I anticipated.

SecretAgentMan
Haozhun

4 Answers

7

If you want a continuous distribution function, try this.

x = [1 2 3 3 4]
subplot(2,1,1)
ksdensity(x)
axis([-4 8 0 0.4])

subplot(2,1,2)
cdfplot(x)
grid off
axis([-4 8 0 1])
title('')

This produces a two-panel figure: the kernel density estimate is on top and the cumulative distribution function is on the bottom.
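
As a small complement to the answer above, ksdensity can also return the estimate instead of plotting it, which makes it possible to check or restyle the curve. This is only a sketch using the sample data from the question; the variable names f, xi, and area are illustrative, and ksdensity requires the Statistics and Machine Learning Toolbox.

```matlab
x = [1 2 3 3 4];                 % sample data from the question
[f, xi] = ksdensity(x);          % f: density estimate, xi: points where it is evaluated
area = trapz(xi, f);             % numerical integral of the estimate, should be close to 1
plot(xi, f)
xlabel('x')
ylabel('Estimated density')
```

Because the result is a density, the curve should integrate to roughly 1 over the evaluation grid, which is what the trapz line checks.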

Iman
FriskyGrub
7

You can generate a discrete probability distribution for your integers using the function hist:

data = [1 2 3 3 4];           %# Sample data
xRange = 0:10;                %# Range of integers to compute a probability for
N = hist(data,xRange);        %# Bin the data
plot(xRange,N./numel(data));  %# Plot the probabilities for each integer
xlabel('Integer value');
ylabel('Probability');

The resulting plot shows the probability for each integer in xRange (image not reproduced here).


UPDATE:

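In newer versions of MATLAB the hist function is no longer recommended. Instead, you can use the histcounts function to produce the same figure as above:

data = [1 2 3 3 4];           %# Sample data
xRange = 0:10;                %# Range of integers to compute a probability for
edges = -0.5:1:10.5;          %# Unit-width bins centered on each integer in xRange
N = histcounts(data, edges, 'Normalization', 'probability');  %# Bin the data
plot(xRange, N);              %# Plot against the integer values, not the bin index
xlabel('Integer value');
ylabel('Probability');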
gnovice
  • @gnovice: just a minor point that you should, in general, divide by the _area_ of the histogram and not the _number of data points_ to get a pdf. So the last line should read `bar(X,N/trapz(X,N))`. Since in this example, the bin points are integers and unit spaced, both `numel` and `trapz` give the same answer, `4`, but if this is not the case, they will be different. – abcd Apr 22 '11 at 16:57
  • @yoda: You are correct, but Gene mentioned having to do this for *integer* values (i.e. a discrete probability distribution) so I thought I'd keep it simple. – gnovice Apr 22 '11 at 17:03
  • Thank you for your answer, I've got one more question, gnovice. @yoda's comment raised my concern. Will this still work correctly if x=[100 200 400 400 550] – Haozhun Apr 22 '11 at 17:20
  • I'll try both on my actual data. Thank you all! – Haozhun Apr 22 '11 at 17:24
  • @Gene: Yes it will. I'm sorry if my comment confused you, but to see what I meant, you could take a look at [my answer](http://stackoverflow.com/questions/5320677/how-to-normalize-a-histogram-in-matlab/5321546#5321546) to an earlier question on normalizing histograms. If you run the code in there, it will illustrate the point I was trying to make. If all you have are discrete integers, then you'll be fine with dividing by `numel`. In either case, `trapz` will give you the correct answer. – abcd Apr 22 '11 at 17:25
  • @Gene: If you had `data = [100 200 400 400 550];` and specified a range of integers like `xRange = 0:600;`, you would get a plot that was mostly 0 except for spikes of 0.2 when x equals 100, 200, and 550 and a spike of 0.4 when x equals 400. As an alternative way to display your data, you may want to try a [STEM](http://www.mathworks.com/help/techdoc/ref/stem.html) plot instead of a regular line plot. It may look better. – gnovice Apr 22 '11 at 17:33
  • @yoda and gnovice: My data are integers, but actually they represents time(I expect several quite high peak at exact same value while other value should look like as if they are not discrete). I'm actually starting to wonder if this is actually not discrete integers inherently. CDF would definitely work, but when coming to PDF, it seems it's more complicated than I anticipated. Do you have any idea? – Haozhun Apr 22 '11 at 19:16
  • @gnovice : it has been long since you have answered this question but how could I do if I haven't integers on the x axis ? Thanks a lot :) `cdfplot` and `ksdensity` don't work in my version of matlab – mwoua Jul 15 '13 at 13:11
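
The widely spaced example and the stem suggestion from the comments above can be sketched as follows. This assumes a MATLAB version that has histcounts; the edges variable and the marker styling are illustrative choices, not something taken from the answer.

```matlab
data   = [100 200 400 400 550];               % data from the comment above
xRange = 0:600;                               % integers to compute a probability for
edges  = [xRange - 0.5, xRange(end) + 0.5];   % unit-width bins centered on each integer
p = histcounts(data, edges, 'Normalization', 'probability');
stem(xRange, p, 'Marker', 'none')             % one vertical line per integer value
xlabel('Integer value')
ylabel('Probability')
```

With this data, p is 0.2 at 100, 200, and 550, 0.4 at 400, and 0 everywhere else, matching the description in the comment.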
2

Look up "ksdensity" in the MATLAB help. It computes a kernel density estimate, which gives you a continuous (smoothed) form of the PDF. I think this is exactly what you are looking for.
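
For readers without the toolbox, the idea behind a kernel density estimate can be sketched in base MATLAB: place a Gaussian bump on every data point and average them. This is not ksdensity's implementation; the bandwidth h and the grid xi below are hand-picked assumptions, whereas ksdensity chooses its bandwidth automatically.

```matlab
x  = [1 2 3 3 4];                  % sample data from the question
h  = 0.5;                          % bandwidth, chosen by hand (assumption)
xi = linspace(-2, 7, 200);         % evaluation grid (assumption)

% Scaled distance from every grid point to every data point (200-by-5)
u = bsxfun(@minus, xi(:), x(:).') / h;

% Average of Gaussian kernels centered on the data points
f = sum(exp(-0.5 * u.^2), 2) / (numel(x) * h * sqrt(2*pi));

plot(xi, f)
xlabel('x')
ylabel('Estimated density')
```

A smaller h follows the individual data points more closely, while a larger h gives a smoother curve.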

Ali
0

In addition to the smooth PDF obtained by ksdensity(x), you can also plot a smooth CDF plot using ksdensity(x,'function','cdf').
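
A short sketch of that call, using the sample data from the question. The overlay of the empirical CDF with stairs is an illustrative addition for comparison, not part of the answer.

```matlab
x = [1 2 3 3 4];                               % sample data from the question
[F, xi] = ksdensity(x, 'Function', 'cdf');     % smoothed CDF estimate
plot(xi, F)
hold on
stairs(sort(x), (1:numel(x)) / numel(x))       % empirical CDF steps for comparison
hold off
xlabel('x')
ylabel('Cumulative probability')
legend('Smoothed CDF (ksdensity)', 'Empirical CDF', 'Location', 'northwest')
```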


Michael Dodd