I'm trying to implement a very simple topic model in PyMC3, but I can't get it to sample. I'm fairly sure the issue is in the `p` function, in how I index into `theta` with `z_i`. `theta` isn't really a list, but I don't know how else to pick out the row for a given topic. Any help would be greatly appreciated.

import numpy as np
from pymc3 import *
import theano  # needed for theano.compile.ops.as_op below
import theano.tensor as t

K = 3 #NUMBER OF TOPICS
V = 20 #NUMBER OF WORDS
N = 15 #NUMBER OF DOCUMENTS


#GENERATE RANDOM CATEGORICAL MIXTURES
data = np.ones([N,V])

@theano.compile.ops.as_op(itypes=[t.dmatrix, t.lscalar],otypes=[t.dvector])
def p(theta, z_i):
    return theta[z_i]


model = Model()
with model:

    alpha = np.ones(V)
    beta = np.ones(K)

    theta = Dirichlet('theta', alpha, shape=(K,V))
    phi = Dirichlet('phi', beta, shape=K)

    z = [Categorical('z_%i' % i, phi) for i in range(N)]

    x = [Categorical('x_%i' % (i), p=p(theta,z[i]), observed=data[i]) for i in range(N)]

    print("Created model. Now begin sampling")
    step = Metropolis()
    trace = sample(10000, step)

    trace.get_values('phi')

The error message I get from the code above is:

TypeError: expected type_num 9 (NPY_INT64) got 7
Apply node that caused the error: ScalarFromTensor(InplaceDimShuffle{}.0)
Toposort index: 11
Inputs types: [TensorType(int64, scalar)]
Inputs shapes: [()]
Inputs strides: [()]
Inputs values: [array(1)]
Outputs clients: [[Subtensor{int64}(Elemwise{TrueDiv}[(0, 0)].0, ScalarFromTensor.0)]]

Chris
  • Theano can index into a tensor, so I think your `p(theta, z_i)` function is entirely redundant. – bsmith89 Mar 23 '17 at 13:19
  • Also, your list-comprehensions will be much more difficult for theano to deal with than using a K dimensional Categorical distribution. I think that would look something like `z = Categorical('z', phi, shape=N)` and `x = Categorical('x', theta, observed=data)`. I haven't tested this, though. – bsmith89 Mar 23 '17 at 13:21
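
To make the indexing point from the comments concrete, here is a minimal NumPy-only sketch. It is not PyMC3 code and has not been run against the model above. The names `rng`, `per_doc` and `looped` are made up for illustration, and the draws are plain random samples rather than model variables. It only shows that indexing a `(K, V)` matrix with a length-`N` integer vector selects one row per document in a single step, which is what the `p(theta, z_i)` wrapper and the list comprehension do one element at a time.

```python
import numpy as np

rng = np.random.RandomState(0)  # fixed seed, purely for reproducibility

K, V, N = 3, 20, 15  # topics, words, documents (same sizes as in the question)

# Stand-ins for the model variables, drawn once with NumPy (assumption:
# these mimic the shapes of theta, phi and z, nothing more).
theta = rng.dirichlet(np.ones(V), size=K)  # (K, V): one word distribution per topic
phi = rng.dirichlet(np.ones(K))            # (K,):   topic proportions
z = rng.choice(K, size=N, p=phi)           # (N,):   one topic index per document

# Integer-array indexing picks row z[i] of theta for every document at once.
per_doc = theta[z]                         # (N, V)

# The same result built one document at a time, as in the list comprehension.
looped = np.array([theta[z_i] for z_i in z])

print(per_doc.shape)
print(np.allclose(per_doc, looped))
```

Theano tensors accept the same `theta[z]` syntax, so inside the model the equivalent expression would presumably be `theta[z]` with `z = Categorical('z', phi, shape=N)`. Whether `Categorical` then accepts the resulting `(N, V)` matrix as `p` for this data is something to check against the PyMC3 version in use.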
