Context
I'm writing a script that randomly modifies some parameters of a TensorFlow model. The aim is to "encrypt" the model by recording the modifications made, so that only authorised users can undo them.
To enable this, I want to freeze all model layers except the ones I'm manually modifying.
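For concreteness, here is a minimal sketch of the record-and-undo idea. The noise scheme and helper names (perturb_layer, undo_perturbation) are illustrative, not my final design:

import tensorflow as tf

def perturb_layer(layer, stddev=0.05):
    """Add Gaussian noise to a layer's weights and record it (illustrative)."""
    record = []
    for var in layer.weights:
        noise = tf.random.normal(var.shape, stddev=stddev)
        var.assign_add(noise)        # modify the parameter in place
        record.append((var, noise))  # remember what was added
    return record

def undo_perturbation(record):
    """An authorised user subtracts the recorded noise to restore the weights."""
    for var, noise in record:
        var.assign_sub(noise)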
Problem
To freeze all model layers, I set model.trainable = False. Then, I unfreeze a layer by setting layer.trainable = True. I want the layer's parameters to then be added to model.trainable_variables, so that I can compute gradient updates like this:
with tf.GradientTape() as tape:
    pred = model(x)
    loss = tf.keras.losses.sparse_categorical_crossentropy(y, pred, from_logits=True)

# returns an empty list because model.trainable_variables is []
dloss_dparams = tape.gradient(loss, model.trainable_variables)
Reproducible Example
import tensorflow as tf
import random

# Load pretrained model
model = tf.keras.applications.mobilenet_v2.MobileNetV2(
    input_shape=None,
    alpha=1.0,
    weights='imagenet',
    classifier_activation=None)

# No layers are trainable
model.trainable = False
print(model.trainable_variables)  # prints empty list: []

# Choose 5 random layers to train
selected_layers = [layer.name for layer in model.layers]
while len(selected_layers) > 5:
    rand_index = random.randint(0, len(selected_layers) - 1)
    del selected_layers[rand_index]

# Attempt to make them trainable
for layer in model.layers:
    if layer.name in selected_layers:
        layer.trainable = True

print(model.trainable_variables)  # STILL prints empty list: []
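A quick sanity check (illustrative) confirms that the per-layer flags did change, even though the model-level collection stays empty:

# The selected layers report trainable=True...
for name in selected_layers:
    print(name, model.get_layer(name).trainable)  # prints True for each
# ...but the model still exposes no trainable variables
print(len(model.trainable_variables))  # prints 0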
Attempted Solutions
I've checked TensorFlow's guide on Transfer learning. Unlike my script, it calls model.compile() after adjusting some layers' trainable attributes. But I'm not using an optimiser or evaluation metric; I just want to compute gradients and then manually update the model parameters myself, as sketched below.
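To make "manually update" concrete, this is the kind of loop I have in mind. It's a sketch that assumes the gradients come back non-empty, with lr and the batch x, y as placeholders:

lr = 0.01  # illustrative learning rate

with tf.GradientTape() as tape:
    pred = model(x)
    loss = tf.keras.losses.sparse_categorical_crossentropy(y, pred, from_logits=True)
grads = tape.gradient(loss, model.trainable_variables)

# Plain gradient-descent step applied by hand: no optimiser, no compile()
for var, grad in zip(model.trainable_variables, grads):
    if grad is not None:
        var.assign_sub(lr * grad)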
Similar TensorFlow questions on Stack Overflow are either unanswered or concern bugs that don't apply to my reproducible example.