
I am using an MCMC (Markov chain Monte Carlo) code to test a model and see whether it is validated.

I launch this MCMC with MPI for Python using 64 processors.

This MCMC code is called autoemcee: https://johannesbuchner.github.io/autoemcee/index.html

Once convergence is reached, I get the following error, which seems to come from mpi4py (the traceback was posted as a screenshot):

In particular, the error is raised at line 363 of autoemcee, in this part:

if converged:
    # finally, gelman-rubin diagnostic on chains
    chains = np.asarray([sampler.get_chain(flat=True) for sampler in self.samplers])
    if self.use_mpi:
        recv_chains = self.comm.gather(chains, root=0)  # <-- line 363
        chains = np.concatenate(self.comm.bcast(recv_chains, root=0))

    assert chains.shape == (num_chains, num_steps * num_walkers, self.x_dim), (chains.shape, (num_chains, num_steps * num_walkers, self.x_dim))

    rhat = arviz.rhat(arviz.convert_to_dataset(chains)).x.data
    if self.log:
        self.logger.info("rhat chain diagnostic: %s (<%.3f is good)", rhat, rhat_max)
    converged = np.all(rhat < rhat_max)

    if self.use_mpi:
        converged = self.comm.bcast(converged, root=0)

I don't understand why self.comm.gather(chains, root=0) raises an error. Moreover, the SystemError says "Negative size passed to PyBytes_FromStringAndSize", and I don't understand what that means.

Has anyone run into this kind of error with autoemcee and found a way to work around it?
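One possible reading of the negative size, offered as an assumption rather than a confirmed diagnosis: the lowercase comm.gather in mpi4py pickles each object, and classic MPI message counts and displacements are C ints, so a gathered payload above 2**31 - 1 bytes can wrap around to a negative number. The sketch below only estimates the raw array bytes involved (pickle adds a small overhead on top). The function name gathered_nbytes and all run parameters are hypothetical, not taken from autoemcee or from my actual run.

```python
import numpy as np

# Largest value a signed 32-bit C int can hold.
INT_MAX = 2**31 - 1


def gathered_nbytes(num_chains, num_steps, num_walkers, x_dim,
                    num_ranks, dtype=np.float64):
    """Estimate raw bytes of the chains array on one rank and summed at root.

    The array shape follows the assert in the snippet above:
    (num_chains, num_steps * num_walkers, x_dim).
    """
    itemsize = np.dtype(dtype).itemsize  # plain Python int, no overflow
    per_rank = num_chains * num_steps * num_walkers * x_dim * itemsize
    total = per_rank * num_ranks
    return per_rank, total


# Hypothetical run parameters, chosen only to illustrate the arithmetic.
per_rank, total = gathered_nbytes(
    num_chains=4, num_steps=10_000, num_walkers=100, x_dim=20, num_ranks=64
)

print(f"per rank: {per_rank / 1e9:.2f} GB, fits in C int: {per_rank <= INT_MAX}")
print(f"at root:  {total / 1e9:.2f} GB, fits in C int: {total <= INT_MAX}")
```

With these illustrative numbers, each rank's array fits in a C int but the sum over 64 ranks does not, which would be consistent with the error appearing only at the gather step. If that is the cause, fewer steps, fewer ranks or thinning the chains before the gather would shrink the payload. The mpi4py.util.pkl5 module is aimed at pickled messages beyond this limit, though which collectives it covers depends on the mpi4py version.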

guizmo133
  • Along with this question, it's probably worth creating an issue on the package's GitHub page, https://github.com/JohannesBuchner/autoemcee/issues, and asking the question there. – Matt Pitkin Aug 02 '23 at 14:18
  • @MattPitkin Thanks, I have just created an issue: https://github.com/JohannesBuchner/autoemcee/issues/2 – guizmo133 Aug 14 '23 at 08:15
