I'm a beginner in Halide and tried to use the compute_with() directive but got an error. I've reduced the program to minimal size:

#include "Halide.h"

namespace {

using namespace Halide;

Var x("x"), y("y"), c("c");

class Harris : public Halide::Generator<Harris> {
public:
    Input<Buffer<int32_t, 3>> input{ "input" };
    Output<Buffer<int32_t, 3>> output{"output"};

    void generate() {

        // Algorithm
        Func A("A"), B("B");

        A(x) = input(x, 0, 0);
        B(x) = input(x, 0, 0);

        output(x, y, c) = A(x) + B(x);

        // Schedule
        B.compute_root();
        A.compute_with(B, x);
        output.compute_root();
    }
};

}  // namespace

HALIDE_REGISTER_GENERATOR(Harris, harris)

I get the build error: Invalid compute_with: A.s0 is scheduled inline.

This is a reduction of my actual program. There, instead of output(x, y, c) = A(x) + B(x); there is a long series of computations on A and, separately, on B, and the two results are then combined in a way that requires both to have domains of equal size. These long computations contain multiple compute_root() calls, which is why I used B.compute_root() instead of letting B be evaluated inside output's loop. I want A and B to be evaluated in a single loop, which is why I tried compute_with().

Please help. I've been stuck on this for many days.

1 Answer

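compute_with() is not a substitute for compute_root() or compute_at(); A still needs one of them, otherwise it stays scheduled inline, which is what the error message reports. Try A.compute_root().compute_with(B, x);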
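To make the intent of the fused schedule concrete, here is a rough plain C++ sketch of the loop structure before and after fusing. It is not Halide's generated code. The names Stages, compute_separately, and compute_fused are hypothetical, and a flat vector stands in for input(x, 0, 0).

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for the two stages A and B from the question.
struct Stages {
    std::vector<int32_t> a;
    std::vector<int32_t> b;
};

// Both stages stored at root level, each with its own loop over x.
// Assumption: this is roughly the shape of two separate compute_root() stages.
Stages compute_separately(const std::vector<int32_t>& input) {
    Stages s{std::vector<int32_t>(input.size()),
             std::vector<int32_t>(input.size())};
    for (std::size_t x = 0; x < input.size(); ++x) {
        s.a[x] = input[x];  // loop for A
    }
    for (std::size_t x = 0; x < input.size(); ++x) {
        s.b[x] = input[x];  // loop for B
    }
    return s;
}

// Both stages still stored at root level, but sharing one loop over x.
// Assumption: this is roughly the shape that fusing A's loop with B's at x aims for.
Stages compute_fused(const std::vector<int32_t>& input) {
    Stages s{std::vector<int32_t>(input.size()),
             std::vector<int32_t>(input.size())};
    for (std::size_t x = 0; x < input.size(); ++x) {
        s.a[x] = input[x];  // A's work at this x
        s.b[x] = input[x];  // B's work at the same x
    }
    return s;
}
```

Both functions produce the same values for A and B. Only the number of passes over x differs, which is the part compute_with() controls. Where each stage is stored is still decided by compute_root() or compute_at().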

Andrew Adams
Thank you! That worked. I'd been stuck on it forever. I am a little puzzled, though, by why it doesn't make the program significantly faster: it is just 3% faster. From earlier experience with Halide, I expected that making one pass through the data instead of two would make it almost twice as fast. – user3184007 Aug 12 '23 at 06:07