what is the equivalent of =BETADIST(0.5,8,3)
of excel in python ?
I have tried using scipy.stats and unable to achieve same results..
what is the equivalent of =BETADIST(0.5,8,3)
of excel in python ?
I have tried using scipy.stats and unable to achieve same results..
As G. Anderson points out this depends on your specific application. For a start you can define a beta distribution in scipy, draw some samples and and plot them as follows:
import scipy
import numpy as np
import matplotlib.pyplot as plt
# the true distribution parameters
a_true = 2
b_true = 3
# define a beta distribution and draw some samples
beta_true = scipy.stats.beta(a=a_true, b=b_true)
y_obs = beta_true.rvs(size=500)
x = np.linspace(0, 1, 100)
# plot the samples and PDF
plt.plot(x, beta_true.pdf(x), lw=2, color="r", label='PDF beta_true')
plt.hist(y_obs, bins=20, density=True, label="samples")
plt.xlabel("x")
plt.ylabel("Density")
plt.legend()
plt.show()
If what you want is to infer parameters for a stochastic process that you assume follows a beta distribution you could use pymc:
import pymc as pm
import arviz as az
model = pm.Model()
with model:
# priors for a and b
a = pm.HalfNormal("a", sigma=1)
b = pm.HalfNormal("b", sigma=1)
# observed RV
y = pm.Beta("y", alpha=a, beta=b, observed=y_obs)
tr = pm.sample(1000, tune=2000, chains=4, cores=8, return_inferencedata=True)
>>> az.summary(tr, round_to=4)
mean sd hdi_3% hdi_97% mcse_mean mcse_sd ess_bulk ess_tail r_hat
a 1.9152 0.1089 1.7109 2.1217 0.0035 0.0025 973.6684 1550.1573 1.0024
b 2.9142 0.1751 2.5962 3.2444 0.0057 0.0040 949.3716 1590.1557 1.0036
# estimated beta distribution
a_hat, b_hat = az.summary(tr, round_to=4)["mean"]
beta_estimated = scipy.stats.beta(a=a_hat, b=b_hat)
# plot the PDFs of the true and the estimated distributions of RV "y"
plt.hist(y_obs, bins=20, density=True, label="samples")
plt.plot(x, beta_estimated.pdf(x), lw=2, label='PDF beta_estimated')
plt.plot(x, beta_true.pdf(x), lw=2, color="r", label='PDF beta_true')
plt.xlabel("x")
plt.ylabel("Density")
plt.legend()
plt.show()
Please specify your question so we can give specific advice in return.
There are two functions in SciPy that provide the same calculation as Excel's BETADIST(x, alpha, beta)
: the function scipy.special.btdtr(alpha, beta, x)
and the cdf
method of the distribution scipy.stats.beta
. (Note that btdtr
has parameters in a different order.)
In [13]: from scipy.special import btdtr
In [14]: from scipy.stats import beta
Either of these computes the same result (other than a tiny numerical difference that is the result of floating point imprecision):
In [15]: btdtr(8, 3, 0.5)
Out[15]: 0.054687500000000014
In [16]: beta.cdf(0.5, 8, 3)
Out[16]: 0.0546875
If all you need is the CDF for the standard beta distribution, then use scipy.special.btdtr
, because beta.cdf(0.5, 8, 3)
is 50 times slower than btdtr(8, 3, 0.5)
:
In [18]: %timeit btdtr(8, 3, 0.5)
13.5 µs ± 222 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
In [19]: %timeit beta.cdf(0.5, 8, 3)
673 µs ± 16.3 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
If you will eventually need the PDF, or a function to fit the beta distribution to data, or you need to set the bounds of the support to arbitrary points, then look that scipy.stats.beta
.