
I have the following model, which uses a Jeffreys prior on the parameter of a Geometric distribution. One of the distributions is derived from the others and is used in a Mixture distribution. I am getting an

IndexError: axis 1 is out of bounds [-1, 1)

Please find my code below.

import pymc3 as pm
import theano.tensor as T
from pymc3 import Model, Binomial, DiscreteUniform, Geometric, Mixture

actual_visit_mod_90_days = df_visit_period['mod_90_days'].values

def jeffreys_logp(theta):
    # log of the prior ((1-theta)**-1) * theta**-0.5
    return -(T.log(1 - theta) + 0.5 * T.log(theta))


with Model() as pattern_study:

    dist1 = Binomial('dist1', n=14, p=0.5)
    dist2 = DiscreteUniform('dist2', lower=0, upper=89)
    pp = pm.DensityDist('pp', jeffreys_logp, testval=0.05)
    dist3 = Geometric('dist3', pp)

    someDervDist = pm.DensityDist('someDervDist', lambda x: dist2 - dist1 + dist3 - 1)
    likelihoodDist1 = DiscreteUniform.dist(lower=0, upper=89)

    likelihoodDist2 = pm.DensityDist.dist(lambda x: pm.logsumexp(someDervDist))

    p = 0.05

    p_vd = Mixture('p_vd', [p, 1-p], [likelihoodDist1, likelihoodDist2],
                   observed=actual_visit_mod_90_days)

The error message is:

---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-19-1c9d266976ce> in <module>()
     28     p=0.05
     29 
---> 30     p_vd = Mixture('p_vd',[p,1-p],[likelihoodDist1,likelihoodDist2],observed = actual_visit_mod_90_days)
     31 
     32     #print pm.distributions.generate_samples(postponeProbability)

C:\Users\mdevananda\Continuum\Anaconda2\lib\site-packages\pymc3\distributions\distribution.pyc in __new__(cls, name, *args, **kwargs)
     35             total_size = kwargs.pop('total_size', None)
     36             dist = cls.dist(*args, **kwargs)
---> 37             return model.Var(name, dist, data, total_size)
     38         else:
     39             raise TypeError("Name needs to be a string but got: {}".format(name))

C:\Users\mdevananda\Continuum\Anaconda2\lib\site-packages\pymc3\model.pyc in Var(self, name, dist, data, total_size)
    780                 var = ObservedRV(name=name, data=data,
    781                                  distribution=dist,
--> 782                                  total_size=total_size, model=self)
    783             self.observed_RVs.append(var)
    784             if var.missing_values:

C:\Users\mdevananda\Continuum\Anaconda2\lib\site-packages\pymc3\model.pyc in __init__(self, type, owner, index, name, data, distribution, total_size, model)
   1228 
   1229             self.missing_values = data.missing_values
-> 1230             self.logp_elemwiset = distribution.logp(data)
   1231             # The logp might need scaling in minibatches.
   1232             # This is done in `Factor`.

C:\Users\mdevananda\Continuum\Anaconda2\lib\site-packages\pymc3\distributions\mixture.pyc in logp(self, value)
    111         w = self.w
    112 
--> 113         return bound(logsumexp(tt.log(w) + self._comp_logp(value), axis=-1).sum(),
    114                      w >= 0, w <= 1, tt.allclose(w.sum(axis=-1), 1),
    115                      broadcast_conditions=False)

C:\Users\mdevananda\Continuum\Anaconda2\lib\site-packages\pymc3\distributions\mixture.pyc in _comp_logp(self, value)
     83         except AttributeError:
     84             return tt.stack([comp_dist.logp(value) for comp_dist in comp_dists],
---> 85                             axis=1)
     86 
     87     def _comp_means(self):

C:\Users\mdevananda\Continuum\Anaconda2\lib\site-packages\theano\tensor\basic.pyc in stack(*tensors, **kwargs)
   4583         dtype = scal.upcast(*[i.dtype for i in tensors])
   4584         return theano.tensor.opt.MakeVector(dtype)(*tensors)
-> 4585     return join(axis, *[shape_padaxis(t, axis) for t in tensors])
   4586 
   4587 

C:\Users\mdevananda\Continuum\Anaconda2\lib\site-packages\theano\tensor\basic.pyc in shape_padaxis(t, axis)
   4475     if not -ndim <= axis < ndim:
   4476         msg = 'axis {0} is out of bounds [-{1}, {1})'.format(axis, ndim)
-> 4477         raise IndexError(msg)
   4478     if axis < 0:
   4479         axis += ndim

IndexError: axis 1 is out of bounds [-1, 1)

To me, it seems that the problem is with the `DensityDist.dist` function, but I am not sure whether I am missing something else here.

Edit update: I am trying to learn after how many days from a start date event1 will occur, given the historic pattern of event1 (actual_visit_mod_90_days). Initially, I defined someDervDist as an equation:

someDervDist = dist2-dist1+dist3-1 

that gives the relation between the various probabilities of the events that determine event1.

likelihoodDist2 = pm.DensityDist.dist(lambda x: pm.logsumexp(someDervDist))

will give logsumexp(someDervDist.logp()). I didn't write it as .logp because I thought logsumexp would already give the log probability (am I wrong here?). But I realised this needed a log probability, so I changed the someDervDist equation into a DensityDist distribution (I tried with Deterministic too). Then I assume a small probability p=0.05 that the data deviates from the model of periodic occurrence of event1. Basically, I am looking for something like Mixture([p, 1-p], [DiscreteUniform(lower=0, upper=89), someDervDist]), as sketched below.
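
Written out in full (using the naming from my comments below), the generative story I am after is roughly the following. This is only pseudocode: nonExceptionalVisitDate is a derived quantity, not a standard distribution.

# Pseudocode sketch of the intended model, not runnable PyMC3:
# earliestVisit ~ Binomial(n=14, p=0.5)
# medsRunOut    ~ DiscreteUniform(lower=0, upper=89)
# PostponeProb  ~ Jeffreys prior for the Geometric parameter
# numDecisions  ~ Geometric(PostponeProb)
# nonExceptionalVisitDate = medsRunOut - earliestVisit + numDecisions - 1
# p = 0.05  (small probability of deviating from the periodic pattern)
# visitDate ~ Mixture([p, 1-p],
#                     [DiscreteUniform(0, 89), nonExceptionalVisitDate mod 90])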

  • `DensityDist` is expecting a log-probability but you are passing random variables. Could you explain in more detail what model you are trying to implement, or maybe point to some reference? – aloctavodia Jan 17 '18 at 21:12
  • I thought logsumexp would give the log probability! I have added a few more details to the question. – manjula Jan 18 '18 at 04:04
  • `someDervDist` is the result of operating on random variates (it is like doing `scipy.stats.norm(0, 1).rvs()`), and what you want is the log probability of a certain value (similar to `scipy.stats.norm(0,1).logp(x)`), thus I guess what you are looking for is something similar to `d = pm.DensityDist('d', lambda x: (pm.Normal.dist(-10, 1).logp(x) + pm.Normal.dist(10, 1).logp(x)))` Why do you define `someDervDist = dist2-dist1+dist3-1`? Do you have a reference? – aloctavodia Jan 18 '18 at 12:41
  • For someDervDist, I just tried the way it was done in an example, http://docs.pymc.io/notebooks/GLM-linear.html, and a few others. I changed someDervDist to pm.DensityDist.dist(lambda x: dist2.logp(x) - dist1.logp(x) + dist3.logp(x) - T.log(1)) and used that either directly in Mixture instead of likelihoodDist2, or in likelihoodDist2 as pm.DensityDist.dist(lambda x: pm.logsumexp(someDervDist.logp(x))) and then in Mixture; both throw TypeError: can't turn [TensorConstant{[29. 76... 87. 58.]}] and {} into a dict. cannot convert dictionary update sequence element #0 to a sequence. – manjula Jan 18 '18 at 21:49
  • But I know someDervDist as pm.DensityDist.dist(lambda x: dist2.logp(x) - dist1.logp(x) + dist3.logp(x) - T.log(1)) is wrong, because if we assign some values to dist2, dist1 and dist3, then I expect someDervDist to be dist2-dist1+dist3-1. So I tried DensityDist.dist(lambda x: T.log(dist2-dist1+dist3-1)) and it throws the IndexError mentioned in the question. – manjula Jan 18 '18 at 22:05
  • Could you point me to a reference from where you take the idea to use 'someDervDist = dist2-dist1+dist3-1'? – aloctavodia Jan 18 '18 at 23:48
  • http://docs.pymc.io/notebooks/getting_started.html, where it defines mu: basic_model = pm.Model(); with basic_model: # Priors for unknown model parameters alpha = pm.Normal('alpha', mu=0, sd=10); beta = pm.Normal('beta', mu=0, sd=10, shape=2); sigma = pm.HalfNormal('sigma', sd=1); # Expected value of outcome mu = alpha + beta[0]*X1 + beta[1]*X2; # Likelihood (sampling distribution) of observations Y_obs = pm.Normal('Y_obs', mu=mu, sd=sigma, observed=Y). This uses an equation to define mu. – manjula Jan 19 '18 at 01:53
  • Thanks. In that example `mu` is a parameter of a distribution and is defined in terms of other random variables; that's totally fine. What you are doing is different: `DensityDist` defines a new distribution and takes as input an arbitrary expression for a log probability (not a random variable). Now let me clarify my previous question. Could you point me to a reference (a scientific paper, book or mathematical model, not necessarily a PyMC3 model) from where you take the idea of the model you are trying to implement? – aloctavodia Jan 19 '18 at 11:16
  • The context is a health-care domain. The mathematical model: earliestVisit ~ Binomial(n=14, p=0.5); medsRunOut ~ DiscreteUniform(lower=0, upper=89); PostponeProb ~ Jeffreys prior for the Geometric distribution; numDecisions ~ Geometric(PostponeProb); nonExceptionalVisitDate = medsRunOut - earliestVisit + numDecisions - 1; nonExceptionalVisitDateMod90 ~ DensityDist(lambda x: pm.math.logsumexp(nonExceptionalVisitDate.logp(x))); p = some small constant probability; visitDate ~ Mixture([p, 1-p], [DiscreteUniform(lower=0, upper=89), nonExceptionalVisitDateMod90]). visitDate is the observed data. – manjula Jan 19 '18 at 11:40
  • Sorry for not posting this context in the first place as I was not sure if I could post the context here. This is a probabilistic model I am trying to build as a part of my study so no official scientific paper nor a book to point to for a reference. – manjula Jan 19 '18 at 11:44
  • I am still confused, sorry. How do you think `nonExceptionalVisitDateMod90` is distributed? Could you point to a _simple_ distribution, for example Poisson? Could you use `dist2-dist1+dist3-1 ` as a parameter of such simple distribution? – aloctavodia Jan 19 '18 at 12:42
  • We have nonExceptionalVisitDate = medsRunOut - earliestVisit + numDecisions -1. This is a uniform distribution minus a binomial distribution plus a geometric distribution minus 1. No, there is no simple distribution that this represents! Then the mod 90 distribution is related to that by the mod 90 calculation. Again, this will not correspond to a simple distribution! – manjula Jan 19 '18 at 23:26

1 Answer


Updated answer: using a Constant distribution is not a good idea after all, and it will be deprecated soon. A little trick could help instead: use a distribution with mean someDervDist and a small variance.

actual_visit_mod_90_days = [29, 76, 66, 66, 10, 63, 69, 75, 66, 87, 58]

with pm.Model() as pattern_study:

    dist1 = pm.Binomial('dist1', n=14, p=0.5)
    dist2 = pm.DiscreteUniform('dist2', lower=0, upper=89)
    pp = pm.DensityDist('pp', jeffreys_logp, testval=0.5)
    dist3 = pm.Geometric('dist3', pp)

    someDervDist = pm.Deterministic('someDervDist', dist2 - dist1 + dist3 - 1)
    # a narrow discrete interval around someDervDist stands in for a Constant
    likelihoodDist2 = pm.DiscreteUniform.dist(someDervDist - 1, someDervDist + 1)
    likelihoodDist1 = pm.DiscreteUniform.dist(lower=0, upper=89)

    p = 0.05

    p_vd = pm.Mixture('p_vd', [p, 1-p],
                      [likelihoodDist1, likelihoodDist2],
                      observed=actual_visit_mod_90_days)

    trace = pm.sample(1000, step=pm.Metropolis())
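
If sampling succeeds, the fit can be inspected with the standard PyMC3 helpers (a usage sketch; the variable names are those of the model above):

pm.summary(trace)    # tabular summary of dist1, dist2, pp, dist3 and someDervDist
pm.traceplot(trace)  # trace and marginal density plots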

As an alternative, you can try a continuous version of your model, something like this:

with pm.Model() as pattern_study:

    dist1 = pm.Normal('dist1', 7, 1.5)
    dist2 = pm.Uniform('dist2', lower=0, upper=89)
    dist3 = pm.Exponential('dist3', 1)

    someDervDist = pm.Deterministic('someDervDist', dist2 - dist1 + dist3 - 1)
    # a Normal with a small sd acts as a soft equality to someDervDist
    likelihoodDist2 = pm.Normal.dist(someDervDist, 0.1)
    likelihoodDist1 = pm.Uniform.dist(lower=0, upper=89)

    p = 0.05

    p_vd = pm.Mixture('p_vd', [p, 1-p],
                      [likelihoodDist1, likelihoodDist2],
                      observed=actual_visit_mod_90_days)

    trace = pm.sample(1000)
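
Either version can be sanity-checked with a posterior predictive check. A minimal sketch, assuming PyMC3 of this vintage, where the function is called sample_ppc (later renamed sample_posterior_predictive):

with pattern_study:
    ppc = pm.sample_ppc(trace, samples=500)

# compare simulated visit days against the observed values
print(ppc['p_vd'].mean(axis=0))
print(actual_visit_mod_90_days)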
  • I understand it was a typo in your answer p_vd = pm.Mixture('p_vd', [p, 1-p], [likelihoodDist1,likelihoodDist2], observed = actual_visit_mod_90_days) but this gives nan for p_vd – manjula Jan 20 '18 at 09:50
  • You are right, sorry. How are you sampling? Did you try changing the prior as I suggested? Could you post your data or at least part of it? – aloctavodia Jan 20 '18 at 13:08
  • Notice that I changed your definition of Jeffreys prior. The model runs with fake data `actual_visit_mod_90_days = np.random.randint(1, 15, 100)` and sampling with `pm.sample()` – aloctavodia Jan 20 '18 at 13:35
  • Thank you for your time. The problem is that I need to stick to the equation for nonExceptionalVisitDate, so your DensityDist is wrong [log(x-y) != log(x) - log(y)], and that is the observed value. I wonder why there is no dist for Deterministic or Potential. – manjula Jan 21 '18 at 00:21
  • It also would be great if you could point me to some example of a Mixture that uses a custom distribution and has observed data. – manjula Jan 21 '18 at 00:26
  • See my updated answer; hopefully this is what you want. Neither `Potential` nor `Deterministic` represents a distribution, which is why they do not have a `dist` method: the first is just an arbitrary factor, like a constraint, and the second is just a non-random variable. I am sorry, but I do not have any example at hand like the one you want. – aloctavodia Jan 21 '18 at 01:19
  • I tried two ways with Constant.dist: 1. with my Jeffreys prior (for me, theta is the 1-theta of the webpage); then it runs fine but with user warnings. 2. with the Beta distribution you suggested; even though it ran without any user warnings, map_estimate = find_MAP(model=pattern_study); print(map_estimate) gives logp = nan and ||grad|| = 0.5. The observed input values are: [29, 76, 66, 66, 10, 63, 69, 75, 66, 87, 58] – manjula Jan 21 '18 at 23:07
  • A constant dist is not a good idea after all, it "freezes" the sampling of all the parent distributions and also it will be deprecated soon. See my updated answer for a trick that you can use. – aloctavodia Jan 22 '18 at 17:40
  • Yes, I am working on your suggestions. I wonder, isn't it possible to call the logp function of a distribution within the definition of another distribution? Something like dist = Binomial(n=28, p=0.5); def some_logp(value): return T.log(T.exp(dist.logp(value)) - T.log(2)) – manjula Jan 24 '18 at 19:55
  • use `pm.Binomial.dist(n=28, p=0.5).logp(value)` – aloctavodia Jan 25 '18 at 00:16
  • def logp_visit_date(value): return T.log(T.exp(medsRunOut.logp(value)) - T.exp(patientVisitDecisionStarts.logp(value)) + T.exp(numDecisions.logp(value)) - 1); nonVisitInterval = pm.DensityDist.dist(lambda x: logp_visit_date(x)); uniformVisitDist = DiscreteUniform.dist(lower=0, upper=89); def p_obs_vdd_logp(value): p = 0.15; return pm.logaddexp(T.log(p) + uniformVisitDist.logp(value), T.log(1-p) + nonVisitInterval.logp(value)); p_obs_vdd = pm.DensityDist('p_obs_vdd', p_obs_vdd_logp, observed=actual_visit_mod_90_days). It gives nan for p_obs_vdd. – manjula Jan 25 '18 at 02:52
  • I am accepting the answer, although it was not the exact answer I was looking for. It helped me give up on Mixture and instead use a DensityDist with another equation, roughly along the lines of the sketch below. – manjula Feb 01 '18 at 03:10
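
For completeness, the mixture-as-DensityDist route that the final comments converge on can be sketched as follows. This is a minimal sketch, not the asker's final model: the second component is a Poisson placeholder for the custom derived distribution, and the weight p is illustrative.

import pymc3 as pm
import theano.tensor as T

with pm.Model() as sketch_model:
    comp1 = pm.DiscreteUniform.dist(lower=0, upper=89)
    comp2 = pm.Poisson.dist(mu=30)  # placeholder for the custom component

    def mixture_logp(value, p=0.05):
        # log p(x) = logaddexp(log(p) + logp1(x), log(1-p) + logp2(x))
        return pm.math.logaddexp(T.log(p) + comp1.logp(value),
                                 T.log(1 - p) + comp2.logp(value))

    obs = pm.DensityDist('obs', mixture_logp,
                         observed=actual_visit_mod_90_days)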