I am new to pymc3, but I've heard it can be used to build a Bayesian update model. So I tried, without success. My goal was to predict which day of the week a person buys a certain product, based on prior information from a number of customers, as well as that person's shopping history.
So let's suppose I know that customers in general buy this product only on Mondays, Tuesdays, Wednesdays, and Thursdays only; and that the number of customers who bought the product in the past on those days is 3,2, 1, and 1, respectively. I thought I would set up my model like this:
import pymc3 as pm
dow = ['m', 'tu', 'w','th']
c = np.array([3, 2, 1, 1])
# hyperparameters (initially all equal)
alphas = np.array([1, 1, 1, 1])
with pm.Model() as model:
# Parameters of the Multinomial are from a Dirichlet
parameters = pm.Dirichlet('parameters', a=alphas, shape=4)
# Observed data is from a Multinomial distribution
observed_data = pm.Multinomial(
'observed_data', n=7, p=parameters, shape=4, observed=c)
So this set up my model without any issues. Then I have an individual customer's data from 4 weeks: 1 means they bought the product, 0 means they didn't, for a given day of the week. I thought updating the model would be as simple as:
c = np.array([[1, 0,0,0],[0,1,0,0],[1,0,0,0],[1,0,0,0],[1,0,0,1])
with pm.Model() as model:
# Parameters are a dirichlet distribution
parameters = pm.Dirichlet('parameters', a=alphas, shape=4)
# Observed data is a multinomial distribution
observed_data = pm.Multinomial(
'observed_data',n=1,p=parameters, shape=4, observed=c)
trace = pm.sample(draws=100, chains=2, tune=50, discard_tuned_samples=True)
This didn't work.
My questions are:
- Does this still take into account the priors I set up before, or does it create a brand-new model?
- As written above, the code didn't work as it gave me a "bad initial energy" error. Through trial and error I found that parameter "n" has to be the sum of the elements in observations (so I can't have observations adding up to different n's). Why is that? Surely the situation I described above (where some weeks they shop only on Mondays, and others on Mondays and Thursday) is not impossible?
Is there a better way of using pymc3 or a different package for this type of problem? Thank you!