The whole point of the optimization procedure is to update the variables that the loss function depends on in a way that reduces the value of the loss. Neither loss nor any variable that loss depends on depends on var (in other words, var is not used in any computation that leads to the calculation of loss), therefore the gradient of loss with respect to var is not defined and var is not updated by the optimizer op back_prop.
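You can check this directly: tf.gradients returns None for a variable that is not connected to the loss. Below is a minimal sketch assuming the TF 1.x graph API; the names w, var and loss are hypothetical stand-ins for the ones in your graph.
import tensorflow as tf

# w participates in the loss, var does not
w = tf.Variable(1.0, name='w')
var = tf.Variable(2.0, name='var')

loss = tf.square(w - 3.0)  # depends only on w

grads = tf.gradients(loss, [w, var])
print(grads)  # [<gradient tensor for w>, None] -- no gradient flows to var
Because its gradient is None, var is simply skipped when the optimizer applies updates.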
~~~
Post-edit answer: If you have a collection of variables param_list and you want to calculate the L2 regularization term over all those variables, you can do it the following way:
# get the list of variables to calculate L2 loss over
param_list = tf.get_collection('var_params')
# a list of L2 regularization terms for each variable in param_list
L2_lst = [tf.nn.l2_loss(param) for param in param_list]
# Total L2 loss
L2_loss = tf.add_n(L2_lst)
One would multiply L2_loss by a regularization parameter lambda before adding it to the main loss:
L2_loss = tf.multiply(lambda_param, L2_loss)
loss = tf.add(loss1, L2_loss)
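Putting it all together, here is a minimal end-to-end sketch, assuming the TF 1.x graph API; the collection name 'var_params', the placeholder shapes, the regularization value 0.01 and the learning rate are hypothetical:
import tensorflow as tf

# Hypothetical model parameters, registered in a collection as they are created
W = tf.Variable(tf.random_normal([10, 1]), name='W')
b = tf.Variable(tf.zeros([1]), name='b')
tf.add_to_collection('var_params', W)
tf.add_to_collection('var_params', b)

x = tf.placeholder(tf.float32, [None, 10])
y = tf.placeholder(tf.float32, [None, 1])
pred = tf.matmul(x, W) + b
loss1 = tf.reduce_mean(tf.square(pred - y))   # main (data) loss

# L2 regularization over every variable in the collection
param_list = tf.get_collection('var_params')
L2_loss = tf.add_n([tf.nn.l2_loss(p) for p in param_list])
L2_loss = tf.multiply(0.01, L2_loss)          # 0.01 is a hypothetical lambda
loss = tf.add(loss1, L2_loss)

back_prop = tf.train.GradientDescentOptimizer(0.1).minimize(loss)
Every variable in 'var_params' now contributes to loss through L2_loss, so each of them receives a gradient and is updated by back_prop.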