equivalent of betadist of excel in python

Question

what is the equivalent of =BETADIST(0.5,8,3) of excel in python ?

I have tried using scipy.stats and unable to achieve same results..

" have tried using scipy.stats and unable to achieve same results" what _specifically_ have you tried, and what went wrong with your attempts? A question should include a [mcve] so that we can offer specific answers and help — G. Anderson, Feb 24 '23 at 23:03

plasma_r · Answer 1 · 2023-02-25T12:58:06.550

As G. Anderson points out this depends on your specific application. For a start you can define a beta distribution in scipy, draw some samples and and plot them as follows:

import scipy
import numpy as np
import matplotlib.pyplot as plt

# the true distribution parameters
a_true = 2
b_true = 3

# define a beta distribution and draw some samples
beta_true = scipy.stats.beta(a=a_true, b=b_true)
y_obs = beta_true.rvs(size=500)

x = np.linspace(0, 1, 100)

# plot the samples and PDF
plt.plot(x, beta_true.pdf(x), lw=2, color="r", label='PDF beta_true')
plt.hist(y_obs, bins=20, density=True, label="samples")
plt.xlabel("x")
plt.ylabel("Density")
plt.legend()
plt.show()

If what you want is to infer parameters for a stochastic process that you assume follows a beta distribution you could use pymc:

import pymc as pm
import arviz as az

model = pm.Model()

with model:
    # priors for a and b
    a = pm.HalfNormal("a", sigma=1)
    b = pm.HalfNormal("b", sigma=1)
    
    # observed RV
    y = pm.Beta("y", alpha=a, beta=b, observed=y_obs)
    
    tr = pm.sample(1000, tune=2000, chains=4, cores=8, return_inferencedata=True)

>>> az.summary(tr, round_to=4)

    mean    sd  hdi_3%  hdi_97%     mcse_mean   mcse_sd     ess_bulk    ess_tail    r_hat
a   1.9152  0.1089  1.7109  2.1217  0.0035  0.0025  973.6684    1550.1573   1.0024
b   2.9142  0.1751  2.5962  3.2444  0.0057  0.0040  949.3716    1590.1557   1.0036

# estimated beta distribution
a_hat, b_hat = az.summary(tr, round_to=4)["mean"]
beta_estimated = scipy.stats.beta(a=a_hat, b=b_hat)

# plot the PDFs of the true and the estimated distributions of RV "y"
plt.hist(y_obs, bins=20, density=True, label="samples")
plt.plot(x, beta_estimated.pdf(x), lw=2, label='PDF beta_estimated')
plt.plot(x, beta_true.pdf(x), lw=2, color="r", label='PDF beta_true')
plt.xlabel("x")
plt.ylabel("Density")
plt.legend()
plt.show()

Please specify your question so we can give specific advice in return.

score 1 · Answer 2 · answered Feb 25 '23 at 02:02

There are two functions in SciPy that provide the same calculation as Excel's BETADIST(x, alpha, beta): the function scipy.special.btdtr(alpha, beta, x) and the cdf method of the distribution scipy.stats.beta. (Note that btdtr has parameters in a different order.)

In [13]: from scipy.special import btdtr

In [14]: from scipy.stats import beta

Either of these computes the same result (other than a tiny numerical difference that is the result of floating point imprecision):

In [15]: btdtr(8, 3, 0.5)
Out[15]: 0.054687500000000014

In [16]: beta.cdf(0.5, 8, 3)
Out[16]: 0.0546875

If all you need is the CDF for the standard beta distribution, then use scipy.special.btdtr, because beta.cdf(0.5, 8, 3) is 50 times slower than btdtr(8, 3, 0.5):

In [18]: %timeit btdtr(8, 3, 0.5)
13.5 µs ± 222 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)

In [19]: %timeit beta.cdf(0.5, 8, 3)
673 µs ± 16.3 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

If you will eventually need the PDF, or a function to fit the beta distribution to data, or you need to set the bounds of the support to arbitrary points, then look that scipy.stats.beta.

equivalent of betadist of excel in python

2 Answers2