0

I am trying to run a hierarchical model with pymc3 in a Win10 environment using Spyder. I have some global model parameters (theta, omega, sigma) and one specific parameter (Ci).

It takes a pd Dataframe as input that contains the relevant data. First column is called 'Cohort', second column is called 'Period', third column contains the observations.

The number of observations differs between cohorts.

The code looks like this:

import pymc3 as pm
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import theano.tensor as tt 

cohorts_idx, cohorts = pd.factorize(inputs['Cohort'], sort = True)
periods_idx, periods = pd.factorize(inputs['Period'], sort = True)

coords = {
    "cohort": cohorts,
    "period": periods,
    "collections": np.arange(len(cohorts_idx))
    }

with pm.Model(coords = coords) as model:
    
    # global model parameters
    omega = pm.HalfNormal("omega", sigma = 3)
    theta = pm.HalfNormal("theta", sigma = 5)
    sigma = pm.HalfNormal("sigma", sigma = 20)
            
    # cohort specific parameter
    Ci = pm.TruncatedNormal("Ci", mu = 60, sigma = 10, lower = 10, upper = 110, dims = "cohort")
    

    mu_i_t = Ci[cohorts_idx] * (1 - tt.exp(- (periods[periods_idx] / theta) ** omega))
    sigma_i_t = sigma * mu_i_t ** 0.5
    
    _ = pm.Normal("Collections_i_t",
                                mu = mu_i_t,
                                sigma = sigma_i_t,            
                                observed = inputs['Collections'],
                                dims = "collections")
         
    results = pm.sample(draws = 1000, tune = 1000, cores = 8)
 
    return pm.summary(results)

The resulting error message is:

mu_i_t = Ci[cohorts_idx] * (1 - tt.exp(- (periods[periods_idx] / theta) ** omega))

File "C:\Users\alexi\Anaconda3\lib\site-packages\pandas\core\indexes\base.py", line 139, in 
index_arithmetic_method
result = op(Series(self), other)

File "C:\Users\alexi\Anaconda3\lib\site-packages\pandas\core\ops\common.py", line 64, in new_method
return method(self, other)

File "C:\Users\alexi\Anaconda3\lib\site-packages\pandas\core\ops\__init__.py", line 505, in wrapper
return _construct_result(left, result, index=left.index, name=res_name)

File "C:\Users\alexi\Anaconda3\lib\site-packages\pandas\core\ops\__init__.py", line 478, in _ 
construct_result
out = left._constructor(result, index=index)

File "C:\Users\alexi\Anaconda3\lib\site-packages\pandas\core\series.py", line 279, in __init__
data = com.maybe_iterable_to_list(data)

File "C:\Users\alexi\Anaconda3\lib\site-packages\pandas\core\common.py", line 280, in 
maybe_iterable_to_list
return list(obj)

File "C:\Users\alexi\Anaconda3\lib\site-packages\theano\tensor\var.py", line 640, in __iter__
for i in xrange(theano.tensor.basic.get_vector_length(self)):

File "C:\Users\alexi\Anaconda3\lib\site-packages\theano\tensor\basic.py", line 4828, in 
get_vector_length
raise ValueError("length not known: %s" % msg)

ValueError: length not known: Elemwise{true_div,no_inplace} [id A] ''   
|TensorConstant{[ 1  2  3 ..  1  2  1]} [id B]
|InplaceDimShuffle{x} [id C] ''   
|ViewOp [id D] 'theta'   
 |Elemwise{exp,no_inplace} [id E] ''   
   |theta_log__ [id F]

I have no clue why. Note that if instead of dividing by Theta in the line that causes an error, I do an addition instead, then it works (obviously it is not what I want though).

How can I resolve this and get this division working?

merv
  • 67,214
  • 13
  • 180
  • 245

1 Answers1

0

OK, I found out. I am not good enough to explain why, but the following works. One needs to replace line:

mu_i_t = Ci[cohorts_idx] * (1 - tt.exp(- (periods[periods_idx] / theta) ** omega))

by:

mu_i_t = Ci[cohorts_idx] * (1 - tt.exp(- (periods[periods_idx].to_numpy() / theta) ** omega))