When implementing the ReLU function for autodiff, one natural approach is to use std::max. Other implementations (conditional statements, boolean multiplication) produce correct derivatives, but the std::max version yields a derivative of 0 over the whole input range.
For the input vector:
dual in[] = { -3.0, -1.5, -0.1, 0.1, 1.5, 3.0 };
the derivative call of the form
derivative(ReLU, wrt(y), at(y))
where y = in[i]
gives correct results when ReLU is implemented as:
dual ReLU_ol(dual x) {
    return (x > 0) * x; // OK: autodiff gives x > 0 ? 1 : 0
}
dual ReLU_if(dual x) {
    // OK: autodiff gives x > 0 ? 1 : 0
    if (x > 0.0) {
        return x;
    } else {
        return 0.0;
    }
}
that is, the derivative is 1 for x > 0 and 0 elsewhere.
When ReLU is implemented in the form:
dual ReLU_max(dual x) {
    return std::max(0.0, (double)x); // wrong: derivative comes out as 0
}
the derivative I get is zero over the whole range.
I would expect std::max (and std::min) to be handled correctly by automatic differentiation. Am I doing something wrong, or am I misunderstanding something?
[Plot: ReLU and its derivative d/dx computed with autodiff; the purple and blue lines overlap; results for ReLU_ol are not plotted.]