
I am trying to implement the Population Monte Carlo algorithm described in this paper (see page 78, Fig. 3) in Python, for a simple model (see the function model()) with one parameter. Unfortunately, the algorithm doesn't work and I can't figure out what's wrong. My implementation is below; the main function is abc(), and all the other functions are helpers that seem to work fine.

To check whether the algorithm works, I first generate observed data with the model's only parameter set to param = 8. The posterior resulting from the ABC algorithm should therefore be centered around 8, but it isn't, and I'm wondering why.

I would appreciate any help or comments.

# imports

from math import exp
from math import log
from math import sqrt
import numpy as np
import random
from scipy.stats import norm


# globals
N = 300              # sample size
N_PARTICLE = 300      # number of particles
ITERS = 5            # number of decreasing thresholds
M = 10               # number of words to remember
MEAN = 7             # prior mean of parameter
SD = 2               # prior sd of parameter


def model(param):
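  # simulate the number of items recalled out of M: the probability of recalling
  # all M items is logistic in param, and each item is then recalled independently
  # with the M-th root of that probability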
  recall_prob_all = 1/(1 + np.exp(M - param))
  recall_prob_one_item = np.exp(np.log(recall_prob_all) / float(M))
  return sum([1 if random.random() < recall_prob_one_item else 0 for item in range(M)])

## example
print "Output of model function: \n" + str(model(10)) + "\n"

# generate data from model
def generate(param):
  out = np.empty(N)
  for i in range(N):
    out[i] = model(param)
  return out

## example

print "Output of generate function: \n" + str(generate(10)) + "\n"


# distance function (sum of squared error)
def distance(obsData,simData):
  out = 0.0
  for i in range(len(obsData)):
    out += (obsData[i] - simData[i]) * (obsData[i] - simData[i])
  return out

## example

print "Output of distance function: \n" + str(distance([1,2,3],[4,5,6])) + "\n"


# sample new particles based on weights
def sample(particles, weights):
  return np.random.choice(particles, 1, p=weights)

## example

print "Output of sample function: \n" + str(sample([1,2,3],[0.1,0.1,0.8])) + "\n"


# perturbance function
def perturb(variance):
  return np.random.normal(0,sqrt(variance),1)[0]

## example 

print "Output of perturb function: \n" + str(perturb(1)) + "\n"

# compute new weight
def computeWeight(prevWeights,prevParticles,prevVariance,currentParticle):
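  # PMC importance weight: prior density at the current particle divided by the
  # weighted mixture of normal perturbation kernels centred on the previous particles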
  denom = 0.0
  proposal = norm(currentParticle, sqrt(prevVariance))
  prior = norm(MEAN,SD)
  for i in range(len(prevParticles)):
    denom += prevWeights[i] * proposal.pdf(prevParticles[i])
  return prior.pdf(currentParticle)/denom


## example 

prevWeights = [0.2,0.3,0.5]
prevParticles = [1,2,3]
prevVariance = 1
currentParticle = 2.5
print "Output of computeWeight function: \n" + str(computeWeight(prevWeights,prevParticles,prevVariance,currentParticle)) + "\n"


# normalize weights
def normalize(weights):
  return weights/np.sum(weights)


## example 

print "Output of normalize function: \n" + str(normalize([3.,5.,9.])) + "\n"


# sampling from prior distribution
def rprior():
  return np.random.normal(MEAN,SD,1)[0]

## example 

print "Output of rprior function: \n" + str(rprior()) + "\n"


# ABC using Population Monte Carlo sampling
def abc(obsData,eps):
  draw = 0
  Distance = 1e9
  variance = np.empty(ITERS)
  simData = np.empty(N)
  particles = np.empty([ITERS,N_PARTICLE])
  weights = np.empty([ITERS,N_PARTICLE])

  for t in range(ITERS):
    if t == 0:
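      # first population: rejection-sample each particle from the prior until its
      # distance to the observed data falls below eps[0]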
      for i in range(N_PARTICLE):
        while(Distance > eps[t]):
          draw = rprior()
          simData = generate(draw)
          Distance = distance(obsData,simData)

        Distance = 1e9
        particles[t][i] = draw
        weights[t][i] = 1./N_PARTICLE

      variance[t] = 2 * np.var(particles[t])
      continue


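    # later populations: resample a particle from the previous population, perturb it,
    # and re-simulate until the distance falls below eps[t]; then compute its importance weight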
    for i in range(N_PARTICLE):
      while(Distance > eps[t]):
        draw = sample(particles[t-1],weights[t-1])
        draw += perturb(variance[t-1])
        simData = generate(draw)
        Distance = distance(obsData,simData)

      Distance = 1e9
      particles[t][i] = draw
      weights[t][i] = computeWeight(weights[t-1],particles[t-1],variance[t-1],particles[t][i])


    weights[t] = normalize(weights[t])  
    variance[t] = 2 * np.var(particles[t])

  return particles[ITERS-1]



true_param = 9
obsData = generate(true_param)
eps = [15000,10000,8000,6000,3000]
posterior = abc(obsData,eps)
#print(posterior)
  • It wouldn't be easy for a person not familiar with the workings of MC to go through your code. Could you verify which parts of your code *do* work as intended? It's also not entirely clear which article you are referring to and which parts of it. Maybe you could cross-reference it in the comments in the code. – aldanor May 28 '16 at 08:57
  • Thanks for your comment. I added the link to the paper which presents the pseudo code for the algorithm. – beginneR May 28 '16 at 09:14
  • The paper is not freely available by the way. – ayhan May 28 '16 at 09:17
  • @ayhan Sorry, I must have been logged in in some way. The algorithm is also presented here (see bottom of page 5). https://arxiv.org/pdf/0805.2256v9.pdf – beginneR May 28 '16 at 09:22

1 Answer


I stumbled upon this question as I was looking for pythonic implementations of PMC algorithms, since, quite coincidentally, I'm currently in the process of applying the techniques in this exact paper to my own research.

Can you post the results you're getting? My guess is that 1) you're using a poor choice of distance function (and/or similarity thresholds), or 2) you're not using enough particles. I may be wrong here (I'm not very well-versed in sample statistics), but your distance function implicitly suggests to me that the ordering of your random draws matters. I'd have to think about this more to determine whether it actually has any effect on the convergence properties (it may not), but why don't you simply use the mean or median as your sample statistic?

I ran your code with 1000 particles and a true parameter value of 8, using the absolute difference between sample means as my distance function, for three iterations with epsilons of [0.5, 0.3, 0.1]; the peak of my estimated posterior distribution approaches 8 on each iteration, just as it should, alongside a reduction in the population variance. Note that there is still a noticeable rightward bias, but this is because of the asymmetry of your model (parameter values of 8 or less can never result in more than 8 observed successes, while all parameter values greater than 8 can, leading to a rightward skewness in the distribution).
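For reference, here is roughly the change I made, as an untested sketch on top of your script (only the distance function and the run settings differ from your code; everything else, including generate() and abc(), is used exactly as you posted it):

# summary-statistic distance: absolute difference of the sample means
# (the sum-of-squared-errors version compares observed and simulated draws
# pairwise, so the ordering of the individual random draws matters)
def distance(obsData, simData):
  return abs(np.mean(obsData) - np.mean(simData))

# settings for the run described above
N_PARTICLE = 1000
ITERS = 3

obsData = generate(8)            # true parameter value of 8
eps = [0.5, 0.3, 0.1]            # one threshold per iteration
posterior = abc(obsData, eps)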

Here's the plot of my results:

[Plot: particle evolution over three iterations, converging to the true value of 8 and showing a slight asymmetry towards higher parameter values.]

jbb
  • You're right, thanks! The reason was the distance function. A few days before you wrote this, I changed to a simple 'absolute difference in means' distance function as well and it worked. – beginneR Jul 04 '16 at 09:36
  • Currently, I'm struggling with the calculation of the weights when dealing with multiple parameters. To me it would make sense to discard or accept sets of parameters, so instead of picking single values of the parameters based on their weights, I'm wondering whether there is one weight per set of parameters. Do you know which version is the correct one? – beginneR Jul 04 '16 at 09:39
  • I believe we're sampling from the full joint posterior distribution of the parameters, so each particle has one sampled value for each individual parameter, and one importance sampling weight. For each of the N particles: 1) a particle is resampled from the previous iteration, using the previous iteration's particle's weights; 2) each parameter value in each randomly sampled particle is perturbed using the proposal distribution, e.g. the multivariate normal; 3) step 2 is repeated for each new particle until the new epsilon is met; and 4) the new weights are computed and normalized. – jbb Jul 05 '16 at 16:14
  • Thank you! But say you want to estimate mean and variance of a normal distribution. What is a proper joint prior distribution? The joint prior has to be specified in order to compute the weight of the particle. – beginneR Jul 06 '16 at 07:34
  • Another problem is that you need to compute the variance of the previous particles for the computation of a new weight. But if a particle is a list/tuple containing sampled values for each parameter, how do you then compute the variance of the particles? – beginneR Jul 06 '16 at 07:51
  • I've honestly struggled to find this unambiguously explained in the literature, specifically the detailed implementation of PMC in ABC applications. I've found [this](http://abcpmc.readthedocs.io/en/latest/) package by Joel Akeret to be particularly helpful though, and you might find it useful to go through his code. Regarding your priors, that's up to you to specify. If you assume independence of your parameters, you can just multiply your priors to get the joint prior. Perhaps try a normal prior for the mean and a Gamma prior for the (positive-definite) variance; a rough sketch of what I mean follows this thread. – jbb Jul 07 '16 at 16:03
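To make the last couple of comments concrete, here is a rough, untested sketch of how the weight update could look for a two-parameter particle (mu, sigma2), with a normal prior on the mean, a Gamma prior on the variance, and a multivariate normal perturbation kernel. All function names and hyperparameters below are made up for illustration and are not part of the code above:

import numpy as np
from scipy.stats import norm, gamma, multivariate_normal

# hypothetical independent priors for a two-parameter particle (mu, sigma2)
prior_mu = norm(0, 10)              # normal prior on the mean
prior_sigma2 = gamma(a=2, scale=2)  # Gamma prior on the (positive) variance

def joint_prior_pdf(particle):
  mu, sigma2 = particle
  # independence assumption: the joint prior is the product of the marginals
  return prior_mu.pdf(mu) * prior_sigma2.pdf(sigma2)

def compute_weight(prev_weights, prev_particles, prev_cov, current_particle):
  # one importance weight per particle (i.e. per set of parameter values):
  # joint prior density at the current particle, divided by the weighted
  # mixture of multivariate normal kernels centred on the previous particles
  kernel = multivariate_normal(mean=current_particle, cov=prev_cov)
  denom = sum(w * kernel.pdf(p) for w, p in zip(prev_weights, prev_particles))
  return joint_prior_pdf(current_particle) / denom

# prev_cov replaces the scalar variance of the one-parameter code; it could be
# taken as, e.g., twice the empirical covariance of the previous population:
# prev_cov = 2 * np.cov(np.array(prev_particles).T)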