I have 3 questions regarding subclassing of scipy.stats.rv_continuous. My goal is to code a statistical mixture model of a truncated normal distribution, a truncated exponential distribution and 2 uniform distributions.
1) Why is drawing random variates via mm_model.rvs(size = 1000) so extremely slow? I read something about performance issues in the documentary, but I was really surprised.
2) After drawing random variates via mm_model.rvs(size = 1000) I get this IntegrationWarning?
IntegrationWarning: The maximum number of subdivisions (50) has been achieved. If increasing the limit yields no improvement it is advised to analyze the integrand in order to determine the difficulties. If the position of a local difficulty can be determined (singularity, discontinuity) one will probably gain from splitting up the interval and calling the integrator on the subranges. Perhaps a special-purpose integrator should be used. warnings.warn(msg, IntegrationWarning)
3) I read in the documentary that I can transmit parameters to the pdf via the "shape" parameter. I tried to adjust my pdf and set the shape parameter but it did not work. Could someone explain it?
Thanks for any help.
def trunc_norm(z,low_bound,up_bound,mu,sigma):
a = (low_bound - mu) / sigma
b = (up_bound - mu) / sigma
return stats.truncnorm.pdf(z,a,b,loc=mu,scale=sigma)
def trunc_exp(z,up_bound,lam):
return stats.truncexpon.pdf(z,b=up_bound*lam,scale=1/lam)
def uniform(z,a,b):
return stats.uniform.pdf(z,loc=a,scale=(b-a))
class Measure_mixture_model(stats.rv_continuous):
def _pdf(self,z):
z_true = 8
z_min = 0
z_max = 10
p_hit = 0.7
p_short = 0.1
p_max = 0.1
p_rand = 0.1
sig_hit = 1
lambda_short = 0.5
sum_p = p_hit + p_short + p_max + p_rand
if sum_p < 0.99 or 1.01 < sum_p:
misc.eprint("apriori probabilities p_hit, p_short, p_max, p_rand have to be 1!")
return None
pdf_hit = trunc_norm(z,z_min,z_max,z_true,sig_hit)
pdf_short = trunc_exp(z,z_true,lambda_short)
pdf_max = uniform(z,0.99*z_max,z_max)
pdf_rand = uniform(z,z_min,z_max)
pdf = p_hit * pdf_hit + p_short * pdf_short + p_max * pdf_max + p_rand * pdf_rand
return pdf
#mm_model = Measure_mixture_model(shapes='z_true,z_min,z_max,p_hit,p_short,p_max,p_rand,sig_hit,lambda_short')
mm_model = Measure_mixture_model()
z = np.linspace(-1,11,1000)
samples = mm_model.pdf(z)
plt.plot(z,samples)
plt.show()
rand_samples = mm_model.rvs(size = 1000)
bins = np.linspace(-1, 11, 100)
plt.hist(rand_samples,bins)
plt.title("measure mixture model")
plt.xlabel("z: measurement")
plt.ylabel("p: relative frequency")
plt.show()