PyMC3 has excellent functionality for Bayesian regression, so I've been trying to use it to fit a Bayesian Gamma regression, i.e. a regression whose likelihood is a Gamma distribution.
From what I understand, running any sort of Bayesian regression in PyMC3 requires the pymc3.glm.GLM() function, which takes a model formula in Patsy form (e.g. y ~ x_1 + x_2 + ... + x_m), the dataframe, and a distribution.
The issue is that pymc3.glm.GLM() requires a pymc3.glm.families object (https://github.com/pymc-devs/pymc3/blob/master/pymc3/glm/families.py) for the distribution, but the Gamma distribution doesn't appear among the families built into the package, so I'm stuck. Or is a Gamma family hidden somewhere? I would appreciate any help in this matter!
For context:
I have a dataframe of features [x_1, x_2, ..., x_m] (call it X) and a target variable (call it y). This is the code I have prepared so far; I just need to figure out how to use the Gamma distribution as my likelihood.
import pymc3 as pm
# Combine X and y into a single dataframe
patsy_DF = X.copy()  # copy, so X itself does not gain a 'y' column
patsy_DF['y'] = y
# Get Patsy Formula
all_columns = "+".join(X.columns)
patsy_formula = "y~" + all_columns
# Instantiate model
model = pm.Model()
# Construct Model
with model:
    # Fit Bayesian Gamma Regression
    pm.glm.GLM.from_formula(patsy_formula, patsy_DF, family=pm.glm.families.Gamma())
    # !!! ... but pm.glm.families.Gamma() doesn't exist ... !!!
    # Get MAP Estimate and Trace
    map_estimate = pm.find_MAP(model=model)
    trace = pm.sample(draws=2000, chains=3, start=map_estimate)
# Get regression results summary (coefficient estimates,
pm.summary(trace).round(3)
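To be concrete about the likelihood I am after, here is a plain-Python sketch of the log-likelihood of a Gamma regression. It is not PyMC3 code and not taken from the library. The log link, the mean/shape parameterization (rate = shape / mean), and the name gamma_glm_loglik are all assumptions made for illustration, and the data in the usage lines are made up.

```python
import math

def gamma_glm_loglik(beta, shape, X, y):
    """Log-likelihood of y under a Gamma regression with a log link.

    Assumed parameterization (illustrative, not a PyMC3 API):
      mean  mu_i = exp(x_i . beta)
      rate_i     = shape / mu_i
    X is a list of feature rows, y a list of positive targets.
    """
    total = 0.0
    for row, y_i in zip(X, y):
        # Linear predictor and log link
        eta = sum(b * x for b, x in zip(beta, row))
        mu = math.exp(eta)
        rate = shape / mu
        # log Gamma(y_i | shape, rate)
        total += (shape * math.log(rate) - math.lgamma(shape)
                  + (shape - 1.0) * math.log(y_i) - rate * y_i)
    return total

# Hypothetical data: first column is an intercept term
X_demo = [[1.0, 0.5], [1.0, 1.5], [1.0, 2.5]]
y_demo = [1.2, 2.9, 7.4]
print(gamma_glm_loglik([0.0, 0.8], 2.0, X_demo, y_demo))
```

In a Bayesian version, beta and shape would get priors and this quantity would be the observed-data term of the posterior.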