I'm trying to write a (C++) helper function that turns a lookup table computed in my C++ code into a Halide Func that takes a float argument and lerps between samples in the LUT.
The use-case here is that the user has generated a tone curve, using a spline with a bunch of control points, that we want to apply with Halide. So I sample this spline at a bunch of evenly spaced points, and I want to write a Halide function that linearly interpolates between those samples.
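The LUT itself is built on the C++ side, roughly like this (evalSpline() is just a placeholder for my actual spline evaluation; the only important part is that the samples are uniformly spaced over [0, 1]):

static std::vector<float> SampleToneCurve(int lutSize) {
    std::vector<float> lut(lutSize);
    for (int i = 0; i < lutSize; i++) {
        float v = (float) i / (lutSize - 1);
        lut[i] = evalSpline(v);  // placeholder for the real spline evaluation
    }
    return lut;
}

Here's my current attempt at the lerping helper itself: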
#include <Halide.h>

#include <iostream>
#include <vector>

using namespace Halide;
using namespace std;
static Func LutFunc(const vector<float> &lut) {
    Func result;
    Var val;

    // Copy the LUT into a Halide Buffer.
    int lutSize = (int) lut.size();
    Buffer<float> lutbuf(lutSize);
    for (int i = 0; i < lutSize; i++) {
        lutbuf(i) = lut[i];
    }

    // Compute the offset into the LUT along with the blending factor
    // for the 2 samples we'll take.
    auto y = val * ((float) lutSize - 1);
    auto index = clamp(cast<int>(y), 0, lutSize - 2);
    auto fract = y - cast<float>(index);

    // Interpolate between the 2 nearest samples.
    result(val) = (lutbuf(index) * (1.0f - fract)) + (lutbuf(index + 1) * fract);
    return result;
}
The problem is that when I try to use this Func in my Halide pipeline, I get this error:
Implicit cast from float32 to int in argument 1 in call to "f" is not allowed. Use an explicit cast.
How can I explain to Halide that the argument to this function should be a float, rather than an int?
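The obvious workaround of adding an explicit cast at the call site compiles, but it truncates the input to an integer before it ever reaches the interpolation, which defeats the point of the LUT:

testFn(x) = f(cast<int>(testData(x)));  // compiles, but the fractional part of the input is gone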
Here's a short test program for the above, in case it's helpful (the LUT in it is just an identity ramp, so I'd expect the output to match the input values):
int main(int argc, char *argv[]) {
    vector<float> lut = { 0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0 };
    auto f = LutFunc(lut);

    Buffer<float> testData(4);
    testData(0) = 0.05;
    testData(1) = 0.15;
    testData(2) = 0.25;
    testData(3) = 0.35;

    Func testFn;
    Var x;
    testFn(x) = f(testData(x));

    Buffer<float> resultBuf(4);
    testFn.realize(resultBuf);

    for (int i = 0; i < 4; i++) {
        cout << i << " = " << resultBuf(i) << endl;
    }
    return 0;
}
(If there's an easier way to generate these lerping LUT functions, especially one that can take advantage of the sampler hardware on GPUs, I'd be interested to know about that too.)