If the answer is yes, does that mean arbitrary ops/functions/methods can be backpropagated through? For example, what is the gradient of scalar_y w.r.t. tensor_x in the code below:
step1: scalar_y = tf.size(tensor_x)
OR scalar_y = paddle.fluid.layers.size(tensor_x)
step2: tensor_z = scalar_y * tensor_x
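For context, here is a minimal, self-contained sketch (assuming TF 2.x eager execution and tf.GradientTape; the tf.cast is only there because tf.size returns an int32 scalar, and the variable names are mine) that I would run to inspect what gradients TensorFlow actually reports for these two steps:

import tensorflow as tf

tensor_x = tf.constant([[1.0, 2.0], [3.0, 4.0]])

with tf.GradientTape(persistent=True) as tape:
    tape.watch(tensor_x)
    # step1: number of elements of tensor_x, cast to float so it can be
    # multiplied with the float tensor and used as a gradient target
    scalar_y = tf.cast(tf.size(tensor_x), tf.float32)
    # step2: scale tensor_x by its own element count
    tensor_z = scalar_y * tensor_x

# Does any gradient flow from scalar_y back to tensor_x, or is it None?
print(tape.gradient(scalar_y, tensor_x))
# Is scalar_y treated as a constant factor when differentiating tensor_z?
print(tape.gradient(tensor_z, tensor_x))

I am asking what these gradients are in principle, not just what this particular snippet prints.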
If the answer is no, then under what circumstances, or for which (types of) ops, does this happen, and why?
One more concrete example (an implementation of DropBlock): lines 34~37 in https://github.com/DHZS/tf-dropblock/blob/master/nets/dropblock.py
My question is whether it is necessary to wrap the mask-derived terms in the line below in tf.stop_gradient:
output = inputs * mask * tf.to_float(tf.size(mask)) / tf.reduce_sum(mask)
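To make the question concrete, this is roughly what I imagine the wrapped version would look like (a sketch of my own, not code from that repository; I use tf.cast in place of the deprecated tf.to_float, and whether the wrapping is needed at all is exactly what I am asking):

import tensorflow as tf

def scale_output(inputs, mask):
    # Normalization factor derived from the mask: element count / kept sum.
    scale = tf.cast(tf.size(mask), inputs.dtype) / tf.reduce_sum(mask)
    # Wrapped in tf.stop_gradient so backprop treats the factor as a constant.
    return inputs * mask * tf.stop_gradient(scale)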
Can anybody help?