I have problems with NUTS sampling in hierarchical model in pymc3. I've done some simpler models in pymc3 without any problems, but with this assignment I can't move. Can someone help me please?
Data can be obtained from here: cars2004.csv. First some work with dataset should be done:
import numpy as np
import pandas as pd
import pymc3 as pm
import theano.tensor as T
data = pd.read_csv("cars2004.csv", sep=",", index_col= 0)
interest = ["price.retail", "cons.city", "cons.highway", "engine.size",
"horsepower","weight", "wheel.base", "length", "width", "ncylinder"]
df1 = data[data.ncylinder != -1]
df1 = df1[df1.fhybrid != "Yes"]
df = df1[interest].copy()
n, p = len(df), 9
df["price.retail"] = df["price.retail"].apply(lambda x: x/1000)
Model in python:
with pm.Model() as model:
lambd = pm.Gamma("lambd", alpha=1, beta=.005)
beta0 = pm.Normal("beta0", mu = 80, sd = 100)
beta = pm.MvNormal("beta", mu = np.zeros(p), tau = lambd*np.eye(p), shape = p)
mu = pm.MvNormal("mu", mu = np.array([10,10,3,200,1500,250,500,10,5]),
cov = np.diag([100,100,100,1000**2,1000**2,100**2,100**2,100**2,100]),
shape=p)
sigmainvert = pm.WishartBartlett("sigmainvert", nu=p,S=.001*np.eye(p),
testval=.001*np.eye(p))
tau = pm.Gamma("tau", 1, .005)
#sigma = pm.Uniform("sigma", .1, 100)
X = pm.MvNormal("X", mu = mu, tau = sigmainvert, shape = (n,p),
observed = df[interest[1:]])
Y = pm.Normal("Y", mu = beta0 + T.dot(X, beta), tau = tau,
observed = df["price.retail"])
dev = pm.Deterministic("dev",-2*Y.logpt)
The values in prior distributions are obtained from assignment. "X" have some missing data and "sigmainvert" should be Wishart(9, diag(0.001,...,0.001).
Application of ADVI for NUTS steps and starting values (first I had some problems with NaN elbo and NaN values, now elbo = -inf). Sampling:
with model:
means, sds, elbo = pm.variational.advi(n=50000, learning_rate=0.1)
step = pm.NUTS(scaling = model.dict_to_array(sds)**2, is_cov=True)
trace = pm.sample(1000, step, start = means)
plot1 = pm.traceplot(trace, varnames=["beta0","beta","tau","mu","dev"],alpha=1)
The result plot of traces: traceplot. As you can see, no MCMC inference was done, traces are constant as starting values. Sampling the same model with Metropolis step makes "mu" traces constant and other somehow OK. Where is the problem?