
I am subclassing `tensorflow.keras.Model` to implement a certain model. Expected behavior:

  1. Training (fitting) time: returns a list of tensors including the final output and auxiliary output;
  2. Inferring (predicting) time: returns a single output tensor.

And the code is:

class SomeModel(tensorflow.keras.Model):
    # ......
    def call(self, x, training=True):
        # ......
        return [aux1, aux2, net] if training else net

This is how I use it:

model = SomeModel(...)
model.compile(...,
    loss=keras.losses.SparseCategoricalCrossentropy(),
    loss_weights=[0.4, 0.4, 1], ...)
# ......
model.fit(data, [labels, labels, labels])

And got:

AssertionError: in converted code:

ipython-input-33-862e679ab098:140 call *

   `return [aux1, aux2, net] if training else net`

...\tensorflow_core\python\autograph\operators\control_flow.py:918 if_stmt

So the problem is that the `if` statement is converted into the computation graph, which of course causes the error. The full stack trace is long and unhelpful, so it is not included here.

So, is there any way to make TensorFlow generate different graphs depending on whether the model is training or not?

Meow Cat 2012

1 Answer


Which TensorFlow version are you using? In TensorFlow 2.2 you can override the behaviour of the `.fit`, `.predict` and `.evaluate` methods, which would generate different graphs for these methods (I assume) and could work for your use case.
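For example, overriding `train_step` (available since TF 2.2) lets `call` always return the single inference output, while the training loop handles the auxiliary losses separately. A minimal sketch with made-up layers and sizes, using the TF 2.2-era `compiled_loss`/`compiled_metrics` API:

```python
import tensorflow as tf

class SomeModel(tf.keras.Model):
    def __init__(self):
        super().__init__()
        # hypothetical layers standing in for the real architecture
        self.backbone = tf.keras.layers.Dense(16, activation="relu")
        self.head = tf.keras.layers.Dense(3)

    def call(self, x):
        # call() always returns the single inference output,
        # so no conditional return structure is needed
        return self.head(self.backbone(x))

    def train_step(self, data):
        x, y = data
        with tf.GradientTape() as tape:
            net = self(x, training=True)
            # auxiliary outputs would be computed here as well and
            # folded into the loss with their respective weights
            loss = self.compiled_loss(y, net)
        grads = tape.gradient(loss, self.trainable_variables)
        self.optimizer.apply_gradients(zip(grads, self.trainable_variables))
        self.compiled_metrics.update_state(y, net)
        return {m.name: m.result() for m in self.metrics}

model = SomeModel()
out = model(tf.zeros((2, 8)))  # inference path: a single tensor
```

This way `.predict` only ever sees the single-tensor output of `call`, and the list of outputs never enters the traced graph.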

The problem with earlier versions is that subclassed models are created by tracing the `call` method. This means Python conditionals become TensorFlow conditionals and face several limitations during graph creation and execution.
First, both branches of the if-else have to be defined, and for Python collections (e.g. lists) both branches have to return the same structure (e.g. number of elements). You can read about the limitations and effects of AutoGraph here and here.

(Also, a conditional may not be evaluated at every run if the condition is based on a Python variable and not a tensor.)
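For instance, a condition on a Python variable is baked into the graph when the function is first traced, so later changes to the variable are silently ignored as long as the input signature stays the same and no retrace happens:

```python
import tensorflow as tf

flag = True

@tf.function
def f(x):
    # `flag` is a plain Python variable: the branch is chosen
    # once, at trace time, not on every call
    return x + 1.0 if flag else x - 1.0

r1 = f(tf.constant(1.0))  # traces with flag == True -> 2.0
flag = False
r2 = f(tf.constant(1.0))  # cached graph is reused -> still 2.0
```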

Andrea Angeli
  • Thanks. I ended up making the custom model not extend `keras.Model`; instead it returns a functional `keras.Model` once called. Maybe subclassing `keras.layers.Layer` is better; I've not tried. – Meow Cat 2012 May 16 '20 at 02:48
  • I think subclassing `keras.layers.Layer` would not solve the issue, since the same behaviour happens (the `call` method gets traced by AutoGraph). The `training` parameter should (I think) change only the behaviour of a layer and not the structure/type/etc. of its output. Maybe I would try multiple output heads (Functional API), and when training is done, define a new model with the prediction head only and copy the weights. – Andrea Angeli May 16 '20 at 06:38
  • Well, `Layer` has an extra `build` method so I thought it could be different. Btw you don't need to copy weights if the input shapes of the created `Layer`s have not changed. I once re-wired a trained demo model and accuracy did not drop, which verified this. – Meow Cat 2012 May 16 '20 at 10:18
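The re-wiring discussed in the comments can be sketched with the Functional API (layer names and sizes here are made up): build a training model with the auxiliary heads, then define a prediction model over the same layer outputs, so the weights are shared and nothing needs to be copied:

```python
import tensorflow as tf

inputs = tf.keras.Input(shape=(8,))
backbone = tf.keras.layers.Dense(16, activation="relu")(inputs)
aux1 = tf.keras.layers.Dense(3, name="aux1")(backbone)
aux2 = tf.keras.layers.Dense(3, name="aux2")(backbone)
net = tf.keras.layers.Dense(3, name="net")(backbone)

# training model returns all three outputs, matching the
# [aux1, aux2, net] loss/loss_weights setup from the question
train_model = tf.keras.Model(inputs, [aux1, aux2, net])

# prediction model re-wires the same layers: the weights are
# shared with train_model, not copied
predict_model = tf.keras.Model(inputs, net)
```

After `train_model.fit(...)`, `predict_model` immediately uses the trained weights, since both models reference the very same layer objects.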