Problem Summary
I have been optimizing my function VectorizedVcdfe, and I am still trying to optimize it. This function is responsible for 99% of the slowness of another function, customFunc. This customFunc is used in a PyMC3 code block.
Please help me optimize VectorizedVcdfe.
Function to optimize
def VectorizedVcdfe(self, x, dataVector, recip_h_times_lambda_vector):
    n = len(dataVector)
    differenceVector = x - dataVector
    stackedDiffVecAndRecipVec = pymc3.math.stack(differenceVector, recip_h_times_lambda_vector)
    erfcTerm = 1. - pymc3.math.erf(self.neg_sqrt1_2 * pymc3.math.prod(stackedDiffVecAndRecipVec, axis=0))
    # Calc F_Hat
    F_Hat = (1. / float(n)) * pymc3.math.sum(0.5 * erfcTerm)
    # Return F_Hat
    return F_Hat
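For anyone who wants to check a faster version against the posted one, here is a minimal NumPy sketch of the quantity the graph computes. It is not the Theano code itself. It assumes that neg_sqrt1_2 equals -sqrt(1/2) (the name suggests this, but only "scalar constant" is stated), that the inputs are plain 1-D arrays, and that n is the number of data points. The helper names vcdfe_stacked and vcdfe_direct are made up for illustration. The second function shows that stacking the two vectors and taking the product along axis 0 is the same as an element-wise multiplication.

```python
import math
import numpy as np

# Assumption: neg_sqrt1_2 is -sqrt(1/2), as its name suggests.
NEG_SQRT1_2 = -math.sqrt(0.5)

# Element-wise erf built from the standard library, so SciPy is not needed.
_erf = np.vectorize(math.erf)


def vcdfe_stacked(x, data, recip):
    """Mirror the posted steps: stack the two vectors, multiply them via prod."""
    stacked = np.stack([x - data, recip])              # shape (2, n)
    erfc_term = 1.0 - _erf(NEG_SQRT1_2 * np.prod(stacked, axis=0))
    return np.sum(0.5 * erfc_term) / data.size


def vcdfe_direct(x, data, recip):
    """Same quantity using a plain element-wise product, no stack or prod."""
    z = (x - data) * recip
    # 0.5 * (1 - erf(-z / sqrt(2))) equals 0.5 * (1 + erf(z / sqrt(2))),
    # which is the standard normal CDF evaluated at z.
    return np.mean(0.5 * (1.0 + _erf(z / math.sqrt(2.0))))
```

Both functions should agree to floating-point precision for any scalar x, which makes the pair usable as a numerical reference when the Theano graph is restructured.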
Arguments/variables
x is a TensorVariable.
dataVector is a 1×n numpy matrix.
recip_h_times_lambda_vector is also a 1×n numpy matrix.
neg_sqrt1_2 is a scalar constant.
How customFunc is used
with pymc3.Model() as model:
    # Create likelihood
    like = pymc3.DensityDist('X', customFunc, shape=2)
    # Make samples
    step = pymc3.NUTS()
    trace = pymc3.sample(2000, tune=1000, init=None, step=step, cores=2)
EDIT:
To answer commenters, random values are OK for both dataVector and recip_h_times_lambda_vector for the purposes of doing this optimization. In reality, recip_h_times_lambda_vector is dependent on dataVector and a scalar parameter h.
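A minimal sketch of such placeholder inputs, in case it helps anyone reproduce the timings. The distributions, the seed, and the choice of n are arbitrary. Keeping the reciprocal vector strictly positive is an assumption on my part (it treats h and the lambda values as positive bandwidth-like quantities).

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so repeated runs use the same data
n = 10_000                      # lower end of the typical range for n

# Placeholder data with shape 1 x n.
dataVector = rng.normal(size=(1, n))

# Assumption: entries are reciprocals of positive values, so all are positive.
recip_h_times_lambda_vector = 1.0 / rng.uniform(0.1, 1.0, size=(1, n))
```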
Some commenters were wondering about customFunc, so here it is...
def customFunc(X):
    Y = []
    for j in range(2):
        x_j = X[j]
        F_x_j = fittedKdEstimator.VCDFE(x_j)
        y_j = myPPF(F_x_j)
        Y.append(y_j)
    logLikelihood = 0.
    recipSqrtTwoPi = 1. / math.sqrt(2. * math.pi)
    for j in range(2):
        y_j = Y[j]
        logLikelihood += pymc3.math.log(recipSqrtTwoPi * pymc3.math.exp(y_j * y_j / -2.))
    return(pymc3.math.exp(logLikelihood))
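As a side check on the per-coordinate term in the second loop, the sketch below uses plain floats and the standard library only (not Theano) to show that log(c * exp(-y²/2)) equals log(c) - y²/2. The function names are made up for illustration. The expanded form also stays finite for large |y|, where the exponential underflows to zero in floating point before the logarithm is applied.

```python
import math

RECIP_SQRT_TWO_PI = 1.0 / math.sqrt(2.0 * math.pi)


def log_term_as_posted(y):
    """log of the standard normal density, written as in customFunc."""
    return math.log(RECIP_SQRT_TWO_PI * math.exp(y * y / -2.0))


def log_term_expanded(y):
    """Same value after expanding the log of the product."""
    return math.log(RECIP_SQRT_TWO_PI) - 0.5 * y * y
```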
The global variable fittedKdEstimator is an instance of the class that contains the functions VectorizedVcdfe and VCDFE.
Here is the Python code for VCDFE...
def VCDFE(self, x):
    if not self.beenFit:
        raise Exception("Must first fit to data")
    return self.VectorizedVcdfe(x, self.__dataVector, self.__recip_h_times_lambda_vector)
On a separate note, the function myPPF is my implementation of the standard normal "percent-point function" (AKA: "quantile function"). I have timed customFunc, and myPPF takes only a small fraction of the total time. The vast majority of the time is consumed by VectorizedVcdfe.
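myPPF itself is not shown here. For anyone experimenting outside the Theano graph, a stand-in for the standard normal quantile function is available in the Python standard library (3.8+). The sketch below works on plain floats only, not on TensorVariables, so it is useful for numerical checks but cannot replace myPPF inside the model. The name reference_ppf is made up for illustration.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1


def reference_ppf(p):
    """Standard normal quantile function for a plain float with 0 < p < 1."""
    return std_normal.inv_cdf(p)
```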
Last but not least, a typical value for n may range from 10,000 to 100,000.