Behavior of Derivatives and propagate_adjoints in Halide: it works in a capricious and unpredictable way

Question

There are two problems.

First problem. The code below gives wrong answers; the Rosenbrock function is causing problems again. More importantly, if I define the function as a linear one, f(t1, t2) = 2 * x(t1, t2) + y(t1, t2), it works correctly. Why does the choice of function affect the correctness of the answers?

Second problem. Now look at the commented-out lines. Replacing the one-line definition of f with those lines produces the error "Integer constant 1 will be implicitly coerced to type (void *), but Halide does not support pointer arithmetic." I see no difference between the two variants. Why does only one of them raise an error?

#include <Halide.h>

#include <algorithm>
#include <iostream>
#include <stdio.h>
#include <typeinfo>

#include <string>
#include <vector>

namespace hld = Halide;
using namespace std;

int main(int argc, char **argv) {
    hld::Var t1("t1");
    hld::Var t2("t2");

    hld::Func x("x");
    hld::Func y("y");
    x(t1, t2) = t1;
    y(t1, t2) = t2; 

    hld::Func f("rosenbrock");  // f(x, y) = (1 - x) ** 2 + 100(y - x ** 2) ** 2
    hld::Func fir("fir");
    hld::Func sec("sec");
/*
    fir(t1, t2) = hld::pow(1 - x(t1, t2), 2);
    sec(t1, t2) = hld::pow(y(t1, t2) - hld::pow(x(t1, t2), 2), 2);
    f(t1, t2) = fir(t1, t2) + 100 * sec(t1, t2);
*/
    f(t1, t2) = hld::pow(1 - x(t1, t2), 2) + 100 * hld::pow(y(t1, t2) - hld::pow(x(t1, t2), 2), 2);

    hld::Func adjoint("adjoint");
    adjoint(t1, t2) = 1;
    hld::Derivative dfd = hld::propagate_adjoints(f, adjoint, {{0, 2}, {0, 4}});

    hld::Func dfdx("dfdx");
    dfdx(t1, t2) = dfd(x)(t1, t2);

    hld::Buffer<int> buf;
    buf = dfdx.realize(2, 4);
    for (int i = 0; i < 2; ++i) {
        for (int j = 0; j < 4; ++j) {
            cout << buf(i, j) << ' ' << i << ' ' << j << endl;
        }
    }

    return 0;
}

f = (1 - x)^2 + 100(y - x^2)^2

df/dx = 2(200x^3 - 200xy + x - 1)

-2 0 0
0 0 1
0 0 2
0 0 3
0 1 0
-2 1 1
-400 1 2
0 1 3

This is the output of my program (each line is value, t1, t2), but I expected:

-2 0 0
-2 0 1
-2 0 2
-2 0 3
400 1 0
...

As you can see, the answers are completely different.
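
The expected values come from evaluating the analytic df/dx on the same 2x4 grid that the Halide program realizes. A minimal sketch in plain C++, with no Halide calls; the names `dfdx_analytic` and `print_expected_grid` are made up for this check and are not part of any library:

```cpp
#include <iostream>

// Analytic df/dx of the Rosenbrock function
// f(x, y) = (1 - x)^2 + 100 (y - x^2)^2.
// Hypothetical helper for this check, not part of Halide.
long dfdx_analytic(long x, long y) {
    return 2 * (200 * x * x * x - 200 * x * y + x - 1);
}

// Prints "value x y" for x in [0, 2) and y in [0, 4),
// in the same order as the loop in the program above.
void print_expected_grid() {
    for (int i = 0; i < 2; ++i) {
        for (int j = 0; j < 4; ++j) {
            std::cout << dfdx_analytic(i, j) << ' ' << i << ' ' << j << '\n';
        }
    }
}
```

Calling `print_expected_grid()` from `main` gives -2 for every point with x = 0 and 400 at (1, 0), which is what I expected from Halide.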

Environment: LLVM 8.0, Halide commit 6e99decbf0876bcfbf856d86afe6d76fe4019a7e

I have always thought that the adjoint supplies weights for the derivative. What is its real purpose?
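
To make that reading concrete, here is a minimal plain-C++ sketch of what "weights for the derivative" would mean for a pointwise function: by the chain rule, the gradient at each point is the adjoint at that point times df/dx. This is only my assumption about the semantics, not a statement of what propagate_adjoints actually does; `weighted_dfdx` is a hypothetical helper and nothing here calls Halide.

```cpp
// Analytic df/dx of f(x, y) = (1 - x)^2 + 100 (y - x^2)^2.
double dfdx(double x, double y) {
    return 2.0 * (200.0 * x * x * x - 200.0 * x * y + x - 1.0);
}

// Hypothetical helper: chain rule for a pointwise function.
// If some quantity L depends on x only through f(x, y), then
// dL/dx = (dL/df) * (df/dx), where `adjoint` stands for dL/df.
double weighted_dfdx(double adjoint, double x, double y) {
    return adjoint * dfdx(x, y);
}
```

Under this reading, an adjoint of 1 everywhere (as in my program) should leave df/dx unchanged, and an adjoint of 0.5 should halve it.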
