I see zero difference in PyMC3 speed when using GPU vs. CPU.
I am fitting a model that requires 500K+ samples to converge. This is very slow, so I tried to speed things up with a GPU (using a GPU instance on EC2). Theano reports that it is using the GPU, so I believe CUDA and Theano are configured correctly. However, I strongly suspect that PyMC3 is not utilising the GPU.
- Do I need to set my variables to TensorType(float32, scalar) explicitly? Currently, they are float64.
- Can only some samplers/likelihoods benefit from CUDA? I am fitting a Poisson-based model and so am using the Metropolis sampler, not NUTS.
- Is there a way to check that PyMC3 is using the GPU?
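
For reference, here is a rough sketch of how I understand the Theano side can be checked: compile a small function and look at the ops in its graph, in the spirit of the GPU test script in the Theano docs. This only tells me what Theano does with a trivial graph, not what PyMC3 does with my model. The helper names `gpu_op_names` and `report_theano_device` are my own, and treating "Gpu" in the op class name as a sign of a GPU op is an assumption on my part.

```python
def gpu_op_names(op_names):
    """Return the op class names that contain 'Gpu'.

    Assumption: Theano's GPU ops (e.g. GpuElemwise) and its host/device
    transfer ops (e.g. HostFromGpu) carry 'Gpu' in their class names.
    """
    return [name for name in op_names if "Gpu" in name]


def report_theano_device():
    """Compile exp(x) on a shared vector and print which ops Theano chose."""
    try:
        import numpy as np
        import theano
        import theano.tensor as tt
    except Exception as exc:  # Theano missing or failing to import
        print("Theano is not usable here:", exc)
        return

    # floatX decides the dtype of the shared variable (float32 or float64).
    x = theano.shared(np.ones(1000, dtype=theano.config.floatX))
    f = theano.function([], tt.exp(x))

    # Class names of the ops in the compiled graph, in execution order.
    names = [type(node.op).__name__ for node in f.maker.fgraph.toposort()]

    print("device:", theano.config.device)
    print("floatX:", theano.config.floatX)
    print("ops in graph:", names)
    print("ops that look like GPU ops:", gpu_op_names(names))


if __name__ == "__main__":
    report_theano_device()
```

If the last line prints an empty list while `device` is set to a GPU, the computation itself stayed on the CPU. A list that contains only transfer ops would mean the data was moved but the `exp` was still not computed on the GPU.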