Training with Keras/TensorFlow in fp16 / half-precision for RTX cards

Question

I just got an RTX 2070 Super and I'd like to try out half-precision training using Keras with the TensorFlow backend.

So far I have found articles like this one that suggest using these settings:

import keras.backend as K

dtype='float16'
K.set_floatx(dtype)

# default is 1e-7 which is too small for float16.  Without adjusting the epsilon, we will get NaN predictions because of divide by zero problems
K.set_epsilon(1e-4)

The network is a simple 4 layer CNN for audio classification.

My input data is a 3D NumPy array generated previously (audio MFCC features extracted with LibROSA). This data was generated on the CPU, and I understand that the values are saved as 32-bit floats.

When I try to train my net with this data I get the following error:

TypeError: Tensors in list passed to 'inputs' of 'Merge' Op have types [float16, float32] that don't all match.

In a different article I read that I should also "Cast back to FP32 before SoftMax layer", which makes things even more confusing...

I would really appreciate some orientation.

Thanks!

score 0 · Answer 1 · answered Feb 06 '20 at 19:26

It is difficult to know the reason for the dtype mismatch without knowing the model architecture, but my guess is that it has a BatchNorm layer before the Merge.

In that case, the merge error and the softmax recommendation would have the same cause: operations that compute statistics (mean/variance) are preferably done in float32. With float16, the rounding errors can be large enough to give inaccurate results, especially in divisions.
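To make the precision concern concrete, here is a small NumPy sketch (not Keras code; `running_sum` is a hypothetical helper written only for this illustration). It accumulates the same values one at a time, as a naive reduction would, once in float16 and once in float32. float16 has an 11-bit significand, so once the accumulator reaches 2048, adding 1 no longer changes it.

```python
import numpy as np

def running_sum(values, dtype):
    """Accumulate values one by one, rounding to `dtype` after every addition."""
    acc = dtype(0)
    for v in values:
        acc = dtype(acc + dtype(v))
    return acc

ones = np.ones(4096, dtype=np.float16)

sum16 = running_sum(ones, np.float16)  # stalls at 2048: 2049 is not representable in float16
sum32 = running_sum(ones, np.float32)  # exact: 4096

# A mean derived from the float16 sum is off by a factor of two here.
mean16 = float(sum16) / len(ones)
mean32 = float(sum32) / len(ones)
print(sum16, sum32, mean16, mean32)
```

Real GPU reductions are usually smarter than this loop, so the effect in practice is typically smaller, but it shows why the statistics in normalization and softmax are commonly kept in float32.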

I haven't tried it, but in the Keras BatchNormalization layer (2.2.5 at least), when TensorFlow is the backend, the sample size used to scale the variance is cast to float32:

    if K.backend() != 'cntk':
        sample_size = K.prod([K.shape(inputs)[axis]
                              for axis in reduction_axes])
        sample_size = K.cast(sample_size, dtype=K.dtype(inputs))
        if K.backend() == 'tensorflow' and sample_size.dtype != 'float32':
            sample_size = K.cast(sample_size, dtype='float32')

        # sample variance - unbiased estimator of population variance
        variance *= sample_size / (sample_size - (1.0 + self.epsilon))

Maybe the tensor that results from the normalization is not cast back to float16, which leads to the error. To check, you could first remove the BatchNorm layer and confirm that the error goes away, and then either modify your local copy of Keras or implement a custom BatchNorm that casts back to 'float16' after normalization.
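The "compute in float32, cast back to float16" idea can be sketched in plain NumPy as follows. This is only an illustration of the pattern, not a Keras layer: the function name `batch_norm_fp16_safe` and the default `epsilon` are assumptions made for the example, and it omits the learned scale/offset and the moving averages a real BatchNorm keeps. In an actual custom layer the final step would presumably be a `K.cast(..., 'float16')` on the normalized tensor.

```python
import numpy as np

def batch_norm_fp16_safe(x, epsilon=1e-3):
    """Normalize over the batch axis in float32, then return the input's dtype."""
    x32 = x.astype(np.float32)           # upcast so the statistics are computed in float32
    mean = x32.mean(axis=0)
    var = x32.var(axis=0)
    y32 = (x32 - mean) / np.sqrt(var + epsilon)
    return y32.astype(x.dtype)           # cast back so downstream ops see float16 again

rng = np.random.default_rng(0)
x = rng.normal(size=(32, 4)).astype(np.float16)  # stand-in for a float16 activation batch
y = batch_norm_fp16_safe(x)
print(y.dtype, np.abs(y.astype(np.float32).mean(axis=0)).max())
```

Because the output has the same dtype as the input, a later merge/concatenate with other float16 tensors would no longer see mixed types.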
