I have a trained tf.keras model which I would like to convert to a quantized model and retrain with TensorFlow's fake-quant strategy (using Python as the frontend).
Could I somehow apply tf.contrib.quantize.create_training_graph directly to the Keras model's graph and then retrain? There seems to be a problem with the fact that a session has already been created by the time the graph is taken from K.get_session().graph.
For example, the following approach:

import tensorflow as tf
from tensorflow.contrib.quantize import create_training_graph

keras_graph = tf.keras.backend.get_session().graph
create_training_graph(input_graph=keras_graph,
                      quant_delay=int(0 * (len(X_train) / batch_size)))
...
model.compile(...)
model.fit_generator(...)
produces the warning: "Operation '{name:'act_softmax/sub' id:2294 op device:{} def:{{{node act_softmax/sub}} = Sub[T=DT_FLOAT](conv_preds/act_quant/FakeQuantWithMinMaxVars:0, act_softmax/Max)}}' was changed by updating input tensor after it was run by a session. This mutation will have no effect, and will trigger an error in the future. Either don't modify nodes after running them or create a new session."
And indeed, training then fails with: tensorflow.python.framework.errors_impl.FailedPreconditionError: Attempting to use uninitialized value conv_preds/act_quant/conv_preds/act_quant/max/biased
(i.e., does create_training_graph need the graph before a session is created? Is it possible to get the graph from a Keras model before the session is instantiated?)
Alternatively, if this doesn't work, could I convert the (h5) model to a checkpoint, then somehow load the model from this checkpoint into a TensorFlow graph and continue working with pure TensorFlow?
Would appreciate any help or pointers. Thank you!