I have two vectors, 'xp' and 'yp', which hold the x and y values of the data, and a third vector 'x' containing the x coordinates at which I would like to evaluate the interpolated values. My results in Python using NumPy's interp function were as expected:
import numpy as np
xp = np.array([1.0, 1.5, 2.0, 2.5, 3.5, 4.0, 4.5, 7.0, 8.0, 9.0, 10.0, 14.0, 17.0, 18.0, 20.0])
yp = xp**2
x = np.array([3, 5])
np.interp(x, xp, yp)  # array([ 9.25, 26. ])
My question is how do I replicate this algorithm inside a cuda kernel?
Here is my attempt:
Here, size --> len(xp) == len(yp) and x_size --> len(x).
__global__ void lerp_kernel(double* xp, double* yp, double* output_result,
                            double* x, unsigned int size, unsigned int x_size)
{
    for (unsigned int idx = blockIdx.x*blockDim.x + threadIdx.x; idx < size;
         idx += blockDim.x*gridDim.x)
    {
        if (idx > size - 2) {
            idx = size - 2;
        }
        double dx = xp[idx + 1] - xp[idx];
        double dy = yp[idx + 1] - yp[idx];
        double dydx = dy/dx;
        double lerp_result = yp[idx] + (dydx * (x[idx] - xp[idx]));
        output_result[idx] = lerp_result;
    }
}
I think one of the mistakes I am making is that I am not searching for the interval in xp that contains each element of x (using something like numpy.searchsorted in Python). I am not sure how to implement this part in CUDA. If someone knows a better way to do lerp in CUDA, please let me know.
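To make the searchsorted idea concrete, here is my sketch of what I think the fix could look like: one thread per element of x (a grid-stride loop over x_size, not size), with a binary search over xp to find the bracketing interval. The names lerp_point and the demo_* arrays are my own, and the __CUDACC__ guard is only there so the helper can be compiled and checked as plain C++ without nvcc:

```cpp
#include <cmath>
#include <cassert>

#ifndef __CUDACC__  // allow host-only compilation of the helper for checking
#define __host__
#define __device__
#endif

// Find the interval xp[lo] <= xi < xp[lo+1] (like numpy.searchsorted) and
// linearly interpolate. Query points outside [xp[0], xp[size-1]] are clamped
// to the end values, which matches np.interp's default behavior.
__host__ __device__ inline double lerp_point(const double* xp, const double* yp,
                                             unsigned int size, double xi)
{
    if (xi <= xp[0])        return yp[0];
    if (xi >= xp[size - 1]) return yp[size - 1];

    unsigned int lo = 0, hi = size - 1;
    while (hi - lo > 1) {                  // binary search over xp
        unsigned int mid = (lo + hi) / 2;
        if (xp[mid] <= xi) lo = mid; else hi = mid;
    }
    double t = (xi - xp[lo]) / (xp[lo + 1] - xp[lo]);  // weight in [0, 1]
    return yp[lo] + t * (yp[lo + 1] - yp[lo]);
}

#ifdef __CUDACC__
// One output per query point: grid-stride loop over x_size, not size.
__global__ void lerp_kernel(const double* xp, const double* yp,
                            double* output_result, const double* x,
                            unsigned int size, unsigned int x_size)
{
    for (unsigned int i = blockIdx.x * blockDim.x + threadIdx.x; i < x_size;
         i += blockDim.x * gridDim.x)
    {
        output_result[i] = lerp_point(xp, yp, size, x[i]);
    }
}
#endif

// Data from the NumPy example above, for checking against np.interp.
static const double demo_xp[15] = {1.0, 1.5, 2.0, 2.5, 3.5, 4.0, 4.5, 7.0,
                                   8.0, 9.0, 10.0, 14.0, 17.0, 18.0, 20.0};
static const double demo_yp[15] = {1.0, 2.25, 4.0, 6.25, 12.25, 16.0, 20.25,
                                   49.0, 64.0, 81.0, 100.0, 196.0, 289.0,
                                   324.0, 400.0};
```

Note this assumes xp is sorted ascending, which np.interp also requires; the binary search keeps each thread's work at O(log size) instead of a linear scan.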
I have seen lerp functions in the Cg documentation (https://developer.download.nvidia.com/cg/lerp.html), but these need a weight (a fraction between 0 and 1) for each x value. I am not sure how to rescale x so that I can use a weight vector to solve this problem.
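If I understand the Cg-style lerp correctly, the weight for a query point is just its normalized position inside the bracketing interval, so the rescaling reduces to one division once the interval is known. A minimal sketch (helper names lerp_weight and lerp are my own):

```cpp
#include <cassert>

// Weight for xi inside the interval [x0, x1]: 0 at x0, 1 at x1.
inline double lerp_weight(double x0, double x1, double xi)
{
    return (xi - x0) / (x1 - x0);
}

// Cg's lerp(a, b, w) = a + w*(b - a).
inline double lerp(double a, double b, double w)
{
    return a + w * (b - a);
}
```

For example, x = 3 falls in the interval [2.5, 3.5] of xp, so lerp(6.25, 12.25, lerp_weight(2.5, 3.5, 3.0)) reproduces the 9.25 from the NumPy example.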