
How would you normalize a histogram A so that the bin values sum to 1?

If the histogram is instead divided by the bin width, how do you draw it?

I have this

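dist        = rand(50);              % 50x50 matrix of uniform samples on [0,1]
average     = mean(dist, 1);         % mean of each column: 50 sample means
[c,x]       = hist(average, 15);     % counts c at 15 bin centers x
normalized  = c/sum(c);              % scale so the bin values sum to 1
bar(x, normalized, 1)                % width 1 makes adjacent bars touch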

In this case, n = 50.

  • What is the formula for the mean and the variance sigma^2? We write N(mean, sigma^2 / 50), but how?
  • How do you plot both the uniform distribution and the normal distribution?

The histogram must be close to the normal distribution.

nbro
edgarmtze

1 Answer


That is a very unusual way of normalizing a probability density function. I assume you want to normalize such that the area under the curve is 1. In that case, this is what you should do.

[c,x]=hist(average,15);        % counts c at bin centers x
normalized=c/trapz(x,c);       % divide by the trapezoid-rule area
bar(x,normalized)              % plot the scaled histogram
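
To see how the scalings differ, here is a minimal sketch that applies three normalizations to the same histogram. The variable names (binWidth, byCount, byTrapz, byWidth) are illustrative and not from the question. Note that trapz integrates through the bin centers, so it approximates the area of the bars rather than matching it exactly; dividing by sum(c) times the bin width, shown as a third option, makes the total bar area exactly 1.

```matlab
% Sketch: compare three normalizations of the same histogram.
% Variable names below are illustrative, not taken from the question.
average  = mean(rand(50), 1);         % 50 sample means, as in the question
[c, x]   = hist(average, 15);         % counts c at bin centers x
binWidth = x(2) - x(1);               % hist uses equally spaced bins here

byCount = c / sum(c);                 % bar heights sum to 1
byTrapz = c / trapz(x, c);            % trapezoid area through the centers is 1
byWidth = c / (sum(c) * binWidth);    % total area of the bars is 1

fprintf('sum of heights (byCount): %g\n', sum(byCount));
fprintf('trapz area     (byTrapz): %g\n', trapz(x, byTrapz));
fprintf('bar area       (byWidth): %g\n', sum(byWidth) * binWidth);

bar(x, byWidth, 1)                    % width 1 makes adjacent bars touch
```

All three vectors have the same shape and differ only by a constant factor, so the choice affects the y-axis scale, not the look of the plot.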

Either way, to answer your question, you can use randn to generate a normal distribution. Right now you are generating a 50x50 matrix of uniform random numbers and averaging along one dimension to approximate a Gaussian, which is unnecessary. To generate 1000 normally distributed points, use randn(1000,1); if you want a row vector, transpose it or swap the arguments, as in randn(1,1000). To generate a Gaussian distribution with mean mu and variance sigma2 and plot its pdf, you can do the following (an example):

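mu=2;                                  % desired mean
sigma2=3;                              % desired variance
dist=sqrt(sigma2)*randn(1000,1)+mu;    % scale by the standard deviation, then shift
[c,x]=hist(dist,50);                   % counts c at 50 bin centers x
bar(x,c/trapz(x,c))                    % histogram scaled to unit area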

Although this can be done with dedicated functions from the Statistics Toolbox, the approach above is just as straightforward and requires no additional toolboxes.

EDIT

I missed the part where you wanted to know how to generate a uniform distribution. By default, rand gives you a random variable from a uniform distribution on [0,1]. To get a random variable from a uniform distribution on [a, b], use a+(b-a)*rand.
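
For the setup in the question (means of n = 50 uniform draws), a uniform distribution on [a, b] has mean (a+b)/2 and variance (b-a)^2/12, so by the central limit theorem the sample mean is approximately N((a+b)/2, (b-a)^2/(12n)). The following is a minimal sketch that overlays that normal pdf on a unit-area histogram of sample means. The number of sample means m and the names varMean, t and pdfNormal are assumptions made for illustration; the question itself uses only 50 sample means.

```matlab
% Sketch: sample means of n uniform draws against the normal pdf from the CLT.
% n comes from the question; m (number of sample means) is an assumed value.
n = 50;                               % draws averaged per sample mean
m = 10000;                            % assumed number of sample means
a = 0; b = 1;                         % rand draws from the uniform on [0,1]

mu      = (a + b) / 2;                % mean of the uniform distribution
sigma2  = (b - a)^2 / 12;             % variance of the uniform distribution
varMean = sigma2 / n;                 % variance of the mean of n draws

average  = mean(a + (b - a) * rand(n, m), 1);   % m sample means
[c, x]   = hist(average, 40);
binWidth = x(2) - x(1);
bar(x, c / (sum(c) * binWidth), 1)    % histogram scaled to unit area
hold on

t = linspace(min(x), max(x), 200);
pdfNormal = exp(-(t - mu).^2 / (2 * varMean)) / sqrt(2 * pi * varMean);
plot(t, pdfNormal, 'r', 'LineWidth', 2)         % N(mu, sigma2/n)
hold off
```

With a = 0 and b = 1 this gives N(0.5, 1/600). Using many more than 50 sample means is only there to make the histogram smooth enough to compare against the curve.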

abcd
  • M.: Care to explain why your way to normalize is the right way to do it? In the practical context `the area under the curve is 1` it's not so clear why OP's `c/sum(c)` is not sufficient. Thanks – eat Mar 14 '11 at 22:31
  • The definition of a probability density function is such that the area under the curve is 1. In the OP's case, the total number of occurrences is normalized to 1, not the area. It is as good as doing `f/N`, where `N` is the number of elements in the vector (50 in this case). Sure, it is a histogram, but not a density. You can plot both out in MATLAB and see the difference. Agreed, OP never asked for a density, and my comment was just an observation, separate from the answer. – abcd Mar 14 '11 at 22:38
  • M.: What would the density look like? – edgarmtze Mar 15 '11 at 00:49
  • @darkcminor: the density is what I have shown. The difference between a histogram and a density is just in the scaling. If you scale it to unit area, it is called a density. The `trapz` function computes the area and you're dividing the histogram bin values by the area. If you recalculate the area, you'll get 1. – abcd Mar 15 '11 at 00:56