Yule–Simon distribution

In probability and statistics, the Yule–Simon distribution is a discrete probability distribution named after Udny Yule and Herbert A. Simon. Simon originally called it the Yule distribution.

Yule–Simon

[Figure: probability mass function. Yule–Simon PMF on a log-log scale. The function is defined only at integer values of k; the connecting lines do not indicate continuity.]

[Figure: cumulative distribution function. Yule–Simon CDF. The function is defined only at integer values of k; the connecting lines do not indicate continuity.]

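Parameters: $\rho > 0$, shape (real)
Support: $k \in \{1, 2, \dots\}$
PMF: $\rho\,\mathrm{B}(k, \rho + 1)$
CDF: $1 - k\,\mathrm{B}(k, \rho + 1)$
Mean: $\frac{\rho}{\rho - 1}$ for $\rho > 1$
Mode: $1$
Variance: $\frac{\rho^2}{(\rho - 1)^2 (\rho - 2)}$ for $\rho > 2$
Skewness: $\frac{(\rho + 1)^2 \sqrt{\rho - 2}}{(\rho - 3)\,\rho}$ for $\rho > 3$
Ex. kurtosis: $\rho + 3 + \frac{11\rho^3 - 49\rho - 22}{(\rho - 4)(\rho - 3)\,\rho}$ for $\rho > 4$
MGF: does not exist
CF: $\frac{\rho}{\rho + 1}\,{}_2F_1(1, 1; \rho + 2; e^{it})\,e^{it}$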

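The probability mass function (pmf) of the Yule–Simon($\rho$) distribution is

$$f(k;\rho) = \rho\,\mathrm{B}(k, \rho + 1),$$

for integer $k \geq 1$ and real $\rho > 0$, where $\mathrm{B}$ is the beta function. Equivalently, the pmf can be written in terms of the rising factorial as

$$f(k;\rho) = \frac{\rho\,\Gamma(\rho + 1)}{(k)_{\rho + 1}}, \qquad (k)_{\rho + 1} = \frac{\Gamma(k + \rho + 1)}{\Gamma(k)},$$

where $\Gamma$ is the gamma function. Thus, if $\rho$ is an integer,

$$f(k;\rho) = \frac{\rho\,\rho!\,(k - 1)!}{(k + \rho)!}.$$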
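
The two forms of the pmf can be compared numerically. The following is a minimal Python sketch, not a reference implementation: the function names `yule_simon_pmf` and `yule_simon_pmf_integer_rho` are illustrative, and the beta function is evaluated through log-gamma values to avoid overflow for large k.

```python
import math


def yule_simon_pmf(k: int, rho: float) -> float:
    """Return f(k; rho) = rho * B(k, rho + 1) for integer k >= 1 and real rho > 0."""
    if k < 1 or rho <= 0:
        raise ValueError("need integer k >= 1 and rho > 0")
    # log B(k, rho + 1) = log Gamma(k) + log Gamma(rho + 1) - log Gamma(k + rho + 1)
    log_beta = math.lgamma(k) + math.lgamma(rho + 1) - math.lgamma(k + rho + 1)
    return rho * math.exp(log_beta)


def yule_simon_pmf_integer_rho(k: int, rho: int) -> float:
    """Factorial form rho * rho! * (k - 1)! / (k + rho)!, valid only for integer rho >= 1."""
    return rho * math.factorial(rho) * math.factorial(k - 1) / math.factorial(k + rho)


if __name__ == "__main__":
    # Print both forms side by side for rho = 2; they should agree up to rounding.
    for k in range(1, 6):
        print(k, yule_simon_pmf(k, 2.0), yule_simon_pmf_integer_rho(k, 2))
```

For integer $\rho$ the two functions should agree up to floating-point rounding, and summing `yule_simon_pmf` over a long range of k should approach 1.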

The parameter $\rho$ can be estimated using a fixed-point algorithm.
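
The text does not say which fixed-point algorithm is meant, so the following Python sketch shows only one possible iteration and should not be read as the published method. It assumes maximum-likelihood estimation: setting the derivative of the log-likelihood of the pmf $\rho\,\mathrm{B}(k, \rho + 1)$ to zero for samples $k_1, \dots, k_n$ gives $n/\rho = \sum_i \sum_{j=1}^{k_i} 1/(\rho + j)$, which can be rearranged into the update $\rho \leftarrow n / \sum_i \sum_{j=1}^{k_i} 1/(\rho + j)$. The names `fit_rho_fixed_point` and `harmonic_sum` are hypothetical.

```python
from typing import Iterable, List


def harmonic_sum(data: List[int], rho: float) -> float:
    """Return sum over samples k of sum_{j=1..k} 1 / (rho + j)."""
    return sum(sum(1.0 / (rho + j) for j in range(1, k + 1)) for k in data)


def fit_rho_fixed_point(samples: Iterable[int], rho0: float = 1.0,
                        tol: float = 1e-12, max_iter: int = 10_000) -> float:
    """Estimate rho by iterating rho <- n / harmonic_sum(data, rho).

    This is an illustrative iteration derived from the likelihood equation,
    not necessarily the algorithm referred to in the text.
    """
    data = list(samples)
    n = len(data)
    if n == 0 or any(k < 1 for k in data):
        raise ValueError("samples must be a non-empty collection of integers >= 1")
    if all(k == 1 for k in data):
        # The likelihood keeps growing as rho increases, so no finite estimate exists.
        raise ValueError("no finite maximum-likelihood estimate when every sample is 1")
    rho = rho0
    for _ in range(max_iter):
        new_rho = n / harmonic_sum(data, rho)
        if abs(new_rho - rho) <= tol * max(1.0, abs(rho)):
            return new_rho
        rho = new_rho
    raise RuntimeError("fixed-point iteration did not converge")


if __name__ == "__main__":
    observed = [1, 1, 2, 1, 3, 1, 1, 5, 2, 1]  # made-up sample for illustration
    print(fit_rho_fixed_point(observed))
```

The double sum costs time proportional to the total of the samples on every iteration. For heavy-tailed data it may be worth precomputing how many samples are at least j for each j.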

The probability mass function $f$ has the property that for sufficiently large $k$ we have

$$f(k;\rho) \approx \frac{\rho\,\Gamma(\rho + 1)}{k^{\rho + 1}} \propto \frac{1}{k^{\rho + 1}}.$$

This means that the tail of the Yule–Simon distribution is a realization of Zipf's law: $f(k;\rho)$ can be used to model, for example, the relative frequency of the $k$th most frequent word in a large collection of text, which according to Zipf's law is inversely proportional to a (typically small) power of $k$.
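
The power-law tail can be checked numerically. The Python sketch below compares the exact pmf $\rho\,\mathrm{B}(k, \rho + 1)$ with the approximation $\rho\,\Gamma(\rho + 1) / k^{\rho + 1}$. The function names and the choice $\rho = 2$ are illustrative assumptions.

```python
import math


def yule_simon_pmf(k: int, rho: float) -> float:
    """Exact pmf rho * B(k, rho + 1), computed via log-gamma for numerical stability."""
    log_beta = math.lgamma(k) + math.lgamma(rho + 1) - math.lgamma(k + rho + 1)
    return rho * math.exp(log_beta)


def power_law_tail(k: int, rho: float) -> float:
    """Large-k approximation rho * Gamma(rho + 1) / k**(rho + 1)."""
    return rho * math.gamma(rho + 1) / k ** (rho + 1)


def tail_ratio(k: int, rho: float) -> float:
    """Ratio of the exact pmf to the power-law approximation; tends to 1 as k grows."""
    return yule_simon_pmf(k, rho) / power_law_tail(k, rho)


if __name__ == "__main__":
    for k in (10, 100, 1_000, 10_000):
        print(k, tail_ratio(k, 2.0))
```

The ratio should move toward 1 as k increases. On a log-log plot the pmf is then close to a straight line with slope $-(\rho + 1)$.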

This article is issued from Wikipedia. The text is licensed under Creative Commons - Attribution - Sharealike. Additional terms may apply for the media files.