I have three questions:
What is grad_outputs in Chainer?
As an example, take Chainer's F.transpose function. How should the following backward code be understood?

    def backward(self, inputs, grad_outputs):
        gy = grad_outputs[0]
        inv_axes = self.axes
        if self.axes:
            axes = tuple(ax % len(self.axes) for ax in self.axes)
            inv_axes = tuple(numpy.argsort(axes))
        gx = gy.transpose(inv_axes)
        return gx,

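One way to see what the inv_axes computation does is to try it in plain NumPy. Taking numpy.argsort of a permutation gives the inverse permutation, so transposing the gradient by inv_axes should put its axes back in the order of the input. The sketch below is not Chainer's code; the array shape and the axes value are made up for illustration.

```python
import numpy

x = numpy.arange(24).reshape(2, 3, 4)   # illustrative input, shape (2, 3, 4)
axes = (2, 0, -2)                        # -2 refers to axis 1 of a 3-D array

# Forward pass: y has the permuted shape (4, 2, 3).
y = x.transpose(axes)

# Normalize negative axes, then invert the permutation as backward() does.
norm_axes = tuple(ax % len(axes) for ax in axes)
inv_axes = tuple(int(i) for i in numpy.argsort(norm_axes))

# gy has the same shape as y; transposing by inv_axes gives it x's shape.
gy = numpy.ones_like(y)
gx = gy.transpose(inv_axes)
```

Under these assumptions, gx ends up with the same shape as x, which is what a gradient with respect to x needs.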
Suppose I want to implement my own function, but inputs[0] and inputs[1] have different shapes. To backpropagate using the chain rule, I would have to write the following code in backward:

    a, b = inputs
    gy = grad_outputs[0]
    return a * gy, b * gy

But a and b do not have the same shape, so won't a * gy and b * gy raise an error because the shapes don't match for multiplication?
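To make the shape concern concrete, here is a minimal NumPy sketch. It assumes, purely for illustration, that the forward pass is an elementwise product y = a * b in which b is broadcast against a; the question does not say what the forward pass is. In that case gy has the shape of y, and each returned gradient has to match the shape of its own input, so the gradient for the broadcast input is summed over the broadcast axis.

```python
import numpy

a = numpy.arange(6.0).reshape(2, 3)   # hypothetical first input, shape (2, 3)
b = numpy.array([10.0, 20.0, 30.0])   # hypothetical second input, shape (3,)

y = a * b                   # assumed forward pass: b is broadcast to (2, 3)
gy = numpy.ones_like(y)     # upstream gradient has the shape of y

ga = gy * b                 # dy/da = b, already has shape (2, 3)
gb = (gy * a).sum(axis=0)   # dy/db = a, summed over the broadcast axis
```

Here ga has the shape of a and gb has the shape of b. If the shapes involved could not be broadcast together at all, the multiplication would raise a ValueError in NumPy.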