I want to fit a series of Bayesian models that all share a common prior. Each model has its own vector of observations. For instance, say I want to estimate the conversion rate for each product on a website. I could use the website's overall conversion rate as a common prior for all items, and then update it with the observed data for each item. So, to obtain the posterior distribution for the conversion rate of Item A, I would use all the data collected so far for Item A together with the common prior (the website conversion rate).
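To make the setup concrete, here is a minimal sketch of the update for a single item. It assumes the common prior can be written as a Beta distribution, in which case a Bernoulli likelihood gives a Beta posterior in closed form. The prior parameters and the observations for Item A are made-up numbers, not real data.

```python
import numpy as np
from scipy import stats

# Hypothetical shared prior: Beta(2, 98) has mean 0.02, standing in for a
# site-wide conversion rate of about 2%. These values are assumptions.
prior_alpha, prior_beta = 2.0, 98.0

# Hypothetical binary observations for Item A (1 = converted, 0 = did not).
item_a = np.array([0, 0, 1, 0, 0, 0, 1, 0, 0, 0])

conversions = int(item_a.sum())
non_conversions = item_a.size - conversions

# Conjugate update: Beta prior + Bernoulli likelihood gives a Beta posterior
# with the observed counts added to the prior parameters.
posterior = stats.beta(prior_alpha + conversions, prior_beta + non_conversions)
posterior_mean = posterior.mean()
```

Every other item would reuse the same `prior_alpha` and `prior_beta` with its own observation vector.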

This is some code I borrowed from Twiecki's Europy presentation:

import numpy as np
import scipy as sp
import scipy.stats
import pymc as pm

np.random.seed(9)
algo_a = sp.stats.bernoulli(.5).rvs(300) # 50% profitable days
algo_b = sp.stats.bernoulli(.6).rvs(300) # 60% profitable days

model = pm.Model()
with model: # model specifications in PyMC3 are wrapped in a with-statement
    # Define random variables
    theta_a = pm.Beta('theta_a', alpha=5, beta=5) # prior
    theta_b = pm.Beta('theta_b', alpha=5, beta=5) # prior

    # Define how data relates to unknown causes
    data_a = pm.Bernoulli('observed A',
                          p=theta_a, 
                          observed=algo_a)

    data_b = pm.Bernoulli('observed B', 
                          p=theta_b, 
                          observed=algo_b)

    # Inference!
    start = pm.find_MAP() # Find good starting point
    step = pm.Slice() # Instantiate MCMC sampling algorithm
    trace = pm.sample(10000, step, start=start, progressbar=False) # draw posterior samples using slice sampling 

He is inferring which sample performs better in an A/B test. This is similar to my problem, the only difference being that I'm comparing items on a website rather than a baseline and an experiment.

This works well with only two samples, the baseline and the experiment. But what if, for instance, we had 100 experiments running at the same time? Is there a way to pass an array of observations to the likelihood distribution (in this case Bernoulli)? Here it would be a 300x100 array, where 300 is the number of binary observations per sample and 100 is the number of samples.
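For reference, this is roughly the data layout I have in mind, sketched with NumPy only. The per-item rates are hypothetical and exist just to simulate something of the right shape. Each column is one sample (item) and each row is one binary observation.

```python
import numpy as np

rng = np.random.default_rng(9)
n_obs, n_items = 300, 100

# Hypothetical true rates, one per item, used only to simulate data.
true_rates = rng.uniform(0.4, 0.6, size=n_items)

# Each column holds the 300 binary observations for one item;
# the per-item rates are broadcast across the rows.
observations = rng.binomial(1, true_rates, size=(n_obs, n_items))

# Number of successes per item, one entry per column.
successes_per_item = observations.sum(axis=0)
```

`observations` is the array I would like to hand to the likelihood in a single call, instead of defining 100 separate `pm.Bernoulli` variables by hand.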

Gianluca