
What I am trying to achieve is to create a custom loss function in Keras that takes two tensors (y_true, y_pred) with shapes (None, None, None) and (None, None, 3), respectively. The Nones are such that the two shapes are always equal for any given (y_true, y_pred) pair. From these tensors I want to produce two distance matrices that contain the squared distances between every possible point pair (the third dimension, of length 3, contains the x, y, and z spatial values), and then return the mean squared difference between these distance matrices. The first code I tried was this:

def distanceMatrixLoss1(y_true, y_pred):
    distMatrix1 = [[K.sum(K.square(y_true[i] - y_true[j])) for j in range(i + 1, y_true.shape[1])] for i in range(y_true.shape[1])]
    distMatrix2 = [[K.sum(K.square(y_pred[i] - y_pred[j])) for j in range(i + 1, y_pred.shape[1])] for i in range(y_pred.shape[1])]
    return K.mean(K.square(K.flatten(distMatrix1) - K.flatten(distMatrix2)))

(K is the TensorFlow backend.) Needless to say, I got the following error:

'NoneType' object cannot be interpreted as an integer
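The error itself can be reproduced in plain Python, because Keras reports an unknown dimension as `None` at graph-construction time (the tuple below just mimics what `.shape` returns; it is an illustration, not Keras code):

```python
shape = (None, None, 3)  # what Keras reports for a dynamically shaped tensor

try:
    for i in range(shape[1]):  # shape[1] is None before any data is seen
        pass
except TypeError as e:
    print(e)  # 'NoneType' object cannot be interpreted as an integer
```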

This is understandable, since range(None) does not make a lot of sense, and y_true.shape[1] (as well as y_pred.shape[1]) is None. I searched for others who had run into the same problem and found that I could use TensorFlow's scan function:

def distanceMatrixLoss2(y_true, y_pred):

    subtractYfromXi = lambda x, y: tf.scan(lambda xi: K.sum(K.square(xi - y)), x)
    distMatrix = lambda x, y: K.flatten(tf.scan(lambda yi: subtractYfromXi(x, yi), y))

    distMatrix1 = distMatrix(y_true, y_true)
    distMatrix2 = distMatrix(y_pred, y_pred)

    return K.mean(K.square(distMatrix1-distMatrix2))

What I got from this was a different error that I do not fully understand.

TypeError: <lambda>() takes 1 positional argument but 2 were given
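My best guess is that tf.scan calls its function with two positional arguments, (accumulator, element), while my lambdas take only one. A plain-Python sketch of that calling convention (an illustration only, not the real tf.scan) reproduces the error:

```python
def scan(fn, elems, initializer):
    """Plain-Python sketch of tf.scan's calling convention (illustration only)."""
    acc, outputs = initializer, []
    for elem in elems:
        acc = fn(acc, elem)  # fn is called with TWO positional arguments
        outputs.append(acc)
    return outputs

# A one-argument lambda, as in distanceMatrixLoss2, fails:
try:
    scan(lambda xi: xi * 2, [1, 2, 3], 0)
except TypeError as e:
    print(e)  # ... takes 1 positional argument but 2 were given

# A two-argument callable works:
print(scan(lambda acc, x: acc + x, [1, 2, 3], 0))  # [1, 3, 6]
```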

So this went into the trash too. My last try was using the backend's map_fn function:

def distanceMatrixLoss3(y_true, y_pred):

    subtractYfromXi = lambda x, y: K.map_fn(lambda xi: K.sum(K.square(xi - y)), x)
    distMatrix = lambda x, y: K.flatten(K.map_fn(lambda yi: subtractYfromXi(x, yi), y))

    distMatrix1 = distMatrix(y_true, y_true)
    distMatrix2 = distMatrix(y_pred, y_pred)

    return K.mean(K.square(distMatrix1-distMatrix2))

This did not throw an error, but when training started the loss was a constant 0 and stayed that way. So now I am out of ideas, and I kindly ask you to help me untangle this problem. I have already tried to do the same in Mathematica and also failed (here is the link to the corresponding question, if it helps).
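For concreteness, here is a plain-NumPy version of the computation I am after (the function name is just for illustration); this is what I would like the Keras loss to do with concrete arrays:

```python
import numpy as np

def distance_matrix_loss_np(y_true, y_pred):
    # y_true, y_pred: arrays of shape (batch, n_points, 3)
    def sq_dist_matrix(points):
        # diff[b, i, j, :] = points[b, j, :] - points[b, i, :]
        diff = points[:, None, :, :] - points[:, :, None, :]
        return np.sum(diff ** 2, axis=-1)  # (batch, n_points, n_points)

    d_true = sq_dist_matrix(y_true)
    d_pred = sq_dist_matrix(y_pred)
    return np.mean((d_true - d_pred) ** 2)
```

For identical inputs the result is 0; for y_true with two coincident points and y_pred with points (0,0,0) and (1,0,0), the squared-distance matrices are [[0,0],[0,0]] and [[0,1],[1,0]], so the result is 0.5.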

  • What are the first two dimensions? Do you want to pair dimension 0 or dimension 1? – Daniel Möller Jan 30 '20 at 14:56
  • Off-topic: why is your model outputting `None` for the third dimension? Is it well defined? – Daniel Möller Jan 30 '20 at 14:58
  • The 0th dimension is the batch size, the 1st is the output of an lstm, that is why it's None, the 2nd is supposed to encode the coordinates already mentioned. What I would like to do is to iterate over every [x,y,z] point (so along axis 1, the second None) and pair it with every other point (again, along axis 1 in the same tensor). Then I want to produce the distance (or squared distance) between these point pairs and put it into a tensor (this is the pairwise distance matrix). I would do this to both y_pred and y_true and compare the two distance matrices, i.e. compute the MSD between them. – fazekaszs Jan 30 '20 at 15:13

1 Answer


Assuming that dimension 0 is the batch size, as usual, and you don't want to mix samples.
Assuming that dimension 1 is the one you want to make pairs from.
Assuming that the last dimension is 3 in all cases, although your model reports None.

Iterating over tensors is a bad idea. It is better to turn the original 1D point axis into a 2D all-vs-all matrix via broadcasting, even though the off-diagonal values end up duplicated.

def distanceMatrix(true, pred): #shapes (None1, None2, 3)

    #------ creating the distance matrices 1D to 2D -- all vs all

    true1 = K.expand_dims(true, axis=1) #shapes (None1, 1, None2, 3)
    pred1 = K.expand_dims(pred, axis=1)

    true2 = K.expand_dims(true, axis=2) #shapes (None1, None2, 1, 3)
    pred2 = K.expand_dims(pred, axis=2) 

    trueMatrix = true1 - true2 #shapes (None1, None2, None2, 3)
    predMatrix = pred1 - pred2

    #--------- squared Euclidean distance over x, y, z

    #maybe needs a sqrt, if you want true distances?
    trueMatrix = K.sum(K.square(trueMatrix), axis=-1) #shapes (None1, None2, None2)
    predMatrix = K.sum(K.square(predMatrix), axis=-1)


    #-------- loss for each pair

    loss = K.square(trueMatrix - predMatrix)  #shape (None1, None2, None2)

    #----------compensate the duplicated non-diagonals

    diagonal = K.eye(K.shape(true)[1])  #shape (None2, None2)
        #if Keras complains because the input is a tensor, use `tf.eye`

    diagonal = K.expand_dims(diagonal, axis=0) #shape (1, None2, None2)
    diagonal = 0.5 + (diagonal / 2.)

    loss = loss * diagonal

    #--------------

    return K.mean(loss, axis=[1, 2])  #or just K.mean(loss)
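You can sanity-check the broadcasting part with plain NumPy (same axes as in the code above):

```python
import numpy as np

pts = np.array([[[0., 0., 0.],
                 [1., 0., 0.],
                 [0., 2., 0.]]])      # shape (1, 3, 3): one sample, 3 points

a = pts[:, None, :, :]                # like K.expand_dims(pts, axis=1)
b = pts[:, :, None, :]                # like K.expand_dims(pts, axis=2)
sq = np.sum((a - b) ** 2, axis=-1)    # (1, 3, 3) pairwise squared distances

print(sq[0])
# [[0. 1. 4.]
#  [1. 0. 5.]
#  [4. 5. 0.]]
```

Note that each off-diagonal distance appears twice (the matrix is symmetric), which is what the 0.5 weighting of the non-diagonal entries compensates for.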
Daniel Möller
  • Thank you, this is awesome! And I realized I misunderstood your question and did not answer it: I don't know why y_true has a shape of `(None, None, None)`. Honestly, I am new to Keras and I would assume that y_true should automatically have the same shape as y_pred. Here is how I defined my model: `model = Model(inputs=(inputPart1, inputPart2), outputs=scaled)`. `inputPart1` and 2 are two `Input` layers (w/ different shapes in the third dim.), while `scaled` is a `Lambda` layer that multiplies the sequence output of an lstm. – fazekaszs Jan 31 '20 at 10:20