
I have been using PyMC in an analysis of some high energy physics data. It has worked to perfection, the analysis is complete, and we are working on the paper.

I have a small problem, however. I ran the sampler with the RAM database backend, and the traces have been sitting in memory in an IPython kernel process for a couple of months now. The problem is that the workstation support staff want to perform a kernel upgrade and reboot that workstation, which will cause me to lose the traces. I would like to keep these traces (as opposed to just generating new ones), since they are what I've made all the plots with. I'd also like to include a portion of the traces (only the parameters of interest) as supplemental material with the publication.

Is it possible to take an existing chain in a pymc.MCMC object created with the RAM backend, change to a different backend, and write out the traces in the chain?

jsw

1 Answer


The trace values are stored as NumPy arrays, so you can use numpy.savetxt to write the values of each parameter to a file. (This is what the text backend does under the hood.)
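A minimal sketch of what that could look like, assuming `M` is your existing pymc.MCMC object and that `'alpha'` and `'beta'` are placeholders for your actual parameter names:

```python
import numpy as np

# Hypothetical parameter names -- replace with your parameters of interest.
params = ['alpha', 'beta']

for name in params:
    values = M.trace(name)[:]                   # trace values as a NumPy array
    np.savetxt('%s_trace.txt' % name, values)   # plain-text file, one sample per row
```

You can then reload the values later with numpy.loadtxt if you ever need to regenerate the plots.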

While saving your current traces is a good idea, I'd suggest taking the time to make your analysis repeatable before publishing.

Kyle Meyer
  • Thanks. Certainly the analysis is repeatable; I just don't really want to have to redo an overnight run (again) at this point. And the plots are already approved by the collaboration as is. If they change, even if the changes are invisible (which they would be), I'd have to go through the approval process again. So saving the existing traces is preferable. In the more general case, some calculations could be prohibitively expensive to run again. Perhaps you get billed for CPU hours. It is good to be able to salvage a run from a mistake like this. – jsw Feb 20 '14 at 00:56