
I have a neural network with two identical CNN branches (similar to a Siamese network), which merges their outputs and then applies a custom loss function to the merged output, something like this:

     -----------------        -----------------
     |    input_a    |        |    input_b    |
     -----------------        -----------------
     | base_network  |        | base_network  |
     ------------------------------------------
     |           processed_a_b                |
     ------------------------------------------

In my custom loss function, I need to split y vertically into two pieces and then apply a categorical cross-entropy loss to each piece. However, I keep getting dtype errors from my loss function, e.g.:

ValueError                                Traceback (most recent call last)
 in ()
----> 1 model.compile(loss=categorical_crossentropy_loss, optimizer=RMSprop())

/usr/local/lib/python3.5/dist-packages/keras/engine/training.py in compile(self, optimizer, loss, metrics, loss_weights, sample_weight_mode, **kwargs)
    909                 loss_weight = loss_weights_list[i]
    910                 output_loss = weighted_loss(y_true, y_pred,
--> 911                                             sample_weight, mask)
    912                 if len(self.outputs) > 1:
    913                     self.metrics_tensors.append(output_loss)

/usr/local/lib/python3.5/dist-packages/keras/engine/training.py in weighted(y_true, y_pred, weights, mask)
    451         # apply sample weighting
    452         if weights is not None:
--> 453             score_array *= weights
    454             score_array /= K.mean(K.cast(K.not_equal(weights, 0), K.floatx()))
    455         return K.mean(score_array)

/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/math_ops.py in binary_op_wrapper(x, y)
    827     if not isinstance(y, sparse_tensor.SparseTensor):
    828       try:
--> 829         y = ops.convert_to_tensor(y, dtype=x.dtype.base_dtype, name="y")
    830       except TypeError:
    831         # If the RHS is not a tensor, it might be a tensor aware object

/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py in convert_to_tensor(value, dtype, name, preferred_dtype)
    674         name=name,
    675         preferred_dtype=preferred_dtype,
--> 676         as_ref=False)
    677
    678

/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py in internal_convert_to_tensor(value, dtype, name, as_ref, preferred_dtype)
    739
    740     if ret is None:
--> 741       ret = conversion_func(value, dtype=dtype, name=name, as_ref=as_ref)
    742
    743     if ret is NotImplemented:

/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py in _TensorTensorConversionFunction(t, dtype, name, as_ref)
    612     raise ValueError(
    613         "Tensor conversion requested dtype %s for Tensor with dtype %s: %r"
--> 614         % (dtype.name, t.dtype.name, str(t)))
    615     return t
    616

ValueError: Tensor conversion requested dtype float64 for Tensor with dtype float32: 'Tensor("processed_a_b_sample_weights_1:0", shape=(?,), dtype=float32)'

Here is a MWE to reproduce the error:

import tensorflow as tf
from keras import backend as K
from keras.layers import Input, Dense, merge, Dropout
from keras.models import Model, Sequential
from keras.optimizers import RMSprop
import numpy as np

# define the inputs
input_dim = 10
input_a = Input(shape=(input_dim,), name='input_a')
input_b = Input(shape=(input_dim,), name='input_b')
# define base_network
n_class = 4
base_network = Sequential(name='base_network')
base_network.add(Dense(8, input_shape=(input_dim,), activation='relu'))
base_network.add(Dropout(0.1))
base_network.add(Dense(n_class, activation='relu'))
processed_a = base_network(input_a)
processed_b = base_network(input_b)
# merge left and right sections
processed_a_b = merge([processed_a, processed_b], mode='concat', concat_axis=1, name='processed_a_b')
# create the model
model = Model(inputs=[input_a, input_b], outputs=processed_a_b)

# custom loss function
def categorical_crossentropy_loss(y_true, y_pred):
    # break (un-merge) y_true and y_pred into two pieces
    y_true_a, y_true_b = tf.split(value=y_true, num_or_size_splits=2, axis=1)
    y_pred_a, y_pred_b = tf.split(value=y_pred, num_or_size_splits=2, axis=1)
    loss = K.categorical_crossentropy(output=y_pred_a, target=y_true_a) + K.categorical_crossentropy(output=y_pred_b, target=y_true_b) 
    return K.mean(loss)

# compile the model
model.compile(loss=categorical_crossentropy_loss, optimizer=RMSprop())
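
To see which tensor actually ends up as float64, the dtypes can be printed while the graph is built. The snippet below is only a debugging sketch (the `_debug` name is just for illustration, it is not part of my model):

# Debugging sketch: same loss as above, but printing the static dtypes of the
# tensors involved, so the float32/float64 mismatch can be located.
def categorical_crossentropy_loss_debug(y_true, y_pred):
    y_true_a, y_true_b = tf.split(value=y_true, num_or_size_splits=2, axis=1)
    y_pred_a, y_pred_b = tf.split(value=y_pred, num_or_size_splits=2, axis=1)
    loss = K.categorical_crossentropy(output=y_pred_a, target=y_true_a) + K.categorical_crossentropy(output=y_pred_b, target=y_true_b)
    print('y_true:', y_true.dtype, 'y_pred:', y_pred.dtype, 'loss:', loss.dtype)
    return K.mean(loss)
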
  • what line gives you the error? – DarkCygnus Jul 14 '17 at 20:55
  • Also, hi and welcome to Stack Overflow. Please take some time to go through the [welcome tour](https://stackoverflow.com/tour) to know your way around here (and also to earn your first badge), read how to create a [Minimal, Complete, and Verifiable example](https://stackoverflow.com/help/mcve), and also check [How to Ask Good Questions](https://stackoverflow.com/help/how-to-ask) so you increase your chances of getting feedback and useful answers. – DarkCygnus Jul 14 '17 at 20:55
  • @GrayCygnus the error comes from the compile line, where the loss function is used the first time. – Salman Jul 14 '17 at 20:58
  • ok I see, you are using tensorflow as backend based on your tag. Have you tried with theano instead? – DarkCygnus Jul 14 '17 at 21:01
  • Could you trace your error to the part of the *function* that is giving the ValueError? I suspect it is `K.categorical_crossentropy()`, which is expecting float64 tensors while you have float32. – DarkCygnus Jul 14 '17 at 21:14
  • @GrayCygnus Please see the updated question. It has to do with weights! – Salman Jul 14 '17 at 21:21
  • It seems it comes from the `K.mean()` function call; could you save that to a variable and then return the variable to trace the error better? I mean `foo = K.mean(loss)` and `return foo`; you can comment what you get. – DarkCygnus Jul 14 '17 at 21:27
  • Wrote an answer based on what could be your problem; do let me know whether it works or not so I can follow up. – DarkCygnus Jul 14 '17 at 22:22

1 Answer


As your error indicates, you are working with float32 data where float64 is expected. It is necessary to trace the error to its specific line to know for sure which tensor needs to be corrected and to be able to help you better.

However, it seems to be related to the K.mean() call, although ValueErrors can also be raised by the K.categorical_crossentropy() method. The problem could therefore lie in your loss tensor, in both y_preds, or in both y_trues. Given these scenarios, I see two things you could try to solve the problem:

  1. You can cast your tensor(s) (let's assume it is loss) to the desired float64 type, like this (see also the sketch after this list):

    from keras import backend as K
    new_tensor = K.cast(loss, dtype='float64')
    
  2. You can declare your inputs to be of type float64 from the beginning, by passing the dtype parameter to the Input() call (as suggested in these examples), like this:

    input_a = Input(shape=(input_dim,), name='input_a', dtype='float64')
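
For illustration, here is suggestion 1 applied inside the loss function from the question. This is just a sketch, not a tested fix: the cast target should match whatever dtype sits on the other side of the failing multiplication. Since the traceback reports the sample-weights tensor as float32, the sketch casts the loss to K.floatx() (float32 by default); casting to 'float64', as shown above, is the mirror-image choice.

    # Sketch only: the question's loss function with an explicit cast before returning.
    # K.floatx() is 'float32' by default, which matches the sample-weights tensor
    # reported in the traceback; 'float64' would align it with the other operand instead.
    def categorical_crossentropy_loss(y_true, y_pred):
        y_true_a, y_true_b = tf.split(value=y_true, num_or_size_splits=2, axis=1)
        y_pred_a, y_pred_b = tf.split(value=y_pred, num_or_size_splits=2, axis=1)
        loss = K.categorical_crossentropy(output=y_pred_a, target=y_true_a) + K.categorical_crossentropy(output=y_pred_b, target=y_true_b)
        return K.mean(K.cast(loss, K.floatx()))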
    