I want to create a custom metric for Pearson correlation, as defined here.

I'm not sure exactly how to apply it to batches of y_pred and y_true.

What I did:

import tensorflow as tf
from keras import backend as K

def pearson_correlation_f(y_true, y_pred):

    y_true, _ = tf.split(y_true[:, 1:], 2, axis=1)
    y_pred, _ = tf.split(y_pred[:, 1:], 2, axis=1)

    fsp = y_pred - K.mean(y_pred, axis=-1, keepdims=True)
    fst = y_true - K.mean(y_true, axis=-1, keepdims=True)

    corr = K.mean(K.sum(fsp * fst, axis=-1)) / K.mean(
        K.sqrt(K.sum(K.square(fsp), axis=-1) *
               K.sum(K.square(fst), axis=-1)))

    return corr

Is it necessary for me to use keepdims, handle the batch dimension manually, and then take the mean over it? Or does Keras somehow do this automatically?

user3142067

1 Answer

When you use K.mean without an axis, Keras automatically calculates the mean for the entire batch.

And the backend already has standard deviation functions, so it might be cleaner (and perhaps faster) to use them.

If your true data is shaped like (BatchSize, 1), I'd say keepdims is unnecessary. Otherwise I'm not sure, and it would be good to test the results.
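As a quick illustration (plain NumPy standing in for the backend, which follows the same reduction and broadcasting rules), keepdims decides whether the reduced axis survives, which in turn decides whether the subtraction broadcasts:

```python
import numpy as np

# NumPy stand-in for K.mean: keepdims keeps the reduced axis as size 1,
# so the centered values broadcast back against the original shape.
x = np.ones((4, 3))                     # (batch, features)
m = np.mean(x, axis=-1, keepdims=True)  # shape (4, 1)
centered = x - m                        # broadcasts to (4, 3)
# without keepdims the mean has shape (4,), and (4, 3) - (4,) would
# raise a broadcasting error instead of centering each row
```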

(I don't understand why you use split; it also seems unnecessary.)

So, I'd try something like this:

fsp = y_pred - K.mean(y_pred) #K.mean without an axis is a scalar, so it's subtracted from every element of y_pred
fst = y_true - K.mean(y_true)

devP = K.std(y_pred)
devT = K.std(y_true)

return K.mean(fsp*fst)/(devP*devT)
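As a sanity check (not from the answer; plain NumPy mirroring the backend math), the scalar version above reproduces the standard Pearson coefficient, here compared against np.corrcoef:

```python
import numpy as np

# Hypothetical NumPy mirror of the Keras snippet above: same formula,
# checked against NumPy's built-in correlation.
def pearson_np(y_true, y_pred):
    fsp = y_pred - y_pred.mean()
    fst = y_true - y_true.mean()
    return (fsp * fst).mean() / (y_pred.std() * y_true.std())

rng = np.random.default_rng(0)
y_true = rng.normal(size=32)
y_pred = 0.5 * y_true + rng.normal(scale=0.3, size=32)

r = pearson_np(y_true, y_pred)
# matches np.corrcoef because the ddof in covariance and std cancels out
```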

If it's relevant to compute the loss per feature instead of lumping them all into one group:

#original shapes: (batch, 10)

fsp = y_pred - K.mean(y_pred,axis=0) #mean over the batch, keeping the features separate
fst = y_true - K.mean(y_true,axis=0)
    #mean shape: (10,), broadcast against (batch, 10)
    #fsp and fst keep shape (batch, 10)

devP = K.std(y_pred,axis=0)
devT = K.std(y_true,axis=0)
    #dev shape: (10,)

return K.sum(K.mean(fsp*fst,axis=0)/(devP*devT))
    #the mean over the batch makes every tensor in the expression shape (10,)
    #the sum is only there because Keras needs a single loss value

Summing over the ten features or averaging them gives the same result up to a factor of 10. That barely matters to a Keras model, since it only scales the gradient (effectively the learning rate), and most optimizers quickly compensate for it.
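The per-feature version can also be checked in plain NumPy (a hypothetical stand-in, not part of the answer): the summed expression equals the sum of the per-column Pearson correlations.

```python
import numpy as np

# Hypothetical NumPy mirror of the per-feature version above.
rng = np.random.default_rng(1)
y_true = rng.normal(size=(32, 10))                    # (batch, 10)
y_pred = y_true + rng.normal(scale=0.5, size=(32, 10))

fsp = y_pred - y_pred.mean(axis=0)                    # center each feature
fst = y_true - y_true.mean(axis=0)
devP = y_pred.std(axis=0)                             # shape (10,)
devT = y_true.std(axis=0)
loss = ((fsp * fst).mean(axis=0) / (devP * devT)).sum()

# compare against NumPy's correlation, column by column
per_col = [np.corrcoef(y_true[:, j], y_pred[:, j])[0, 1] for j in range(10)]
```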

Daniel Möller
  • I am trying to do the same thing, but in my case I am predicting a single output. I get 'nan' when I use either of the methods above when I try to run model.evaluate(). Can you give me any hints on how to debug this? I can get custom metrics like this to work just fine: def my_mse(y_true, y_pred): return K.mean(K.square(y_pred - y_true), axis=1). I tried putting axis=1 above and still got 'nan'. I also added a check for a zero denominator, returning 0 in that case, to eliminate that possibility. Any help much appreciated. – Zak Keirn Jun 23 '18 at 17:51
  • If you have only one sample, very probably you don't have any deviation or statistics. – Daniel Möller Jun 23 '18 at 18:26
  • Thanks. I just checked the output of model.fit and I can see all the correlations being calculated and they look correct. So evidently the model.evaluate() only does one sample at a time as you suggest. – Zak Keirn Jun 23 '18 at 18:28
  • Usual reasons are invalid data or invalid results in layers. But there are many possibilities. Sometimes, for instance, you get a line full of zeros somewhere, or some impossible division / root, etc. – Daniel Möller Jun 23 '18 at 18:31
  • Based on description of Keras evaluate, it says it does it in default batches of 32 which should have worked? My test vector is much longer. So I am confused about why it seems to work in model.fit() but not in model.evaluate(). – Zak Keirn Jun 23 '18 at 18:41
  • For some reason, model.evaluate() produces 'nan' using the default batch_size of 32, if I set to 50 or 100 or even the entire length of the test data, it works fine. Thanks for your code, it works great. – Zak Keirn Jun 23 '18 at 19:15
  • You got a full zero batch, very probably.... and some operation does not support full zeros there. – Daniel Möller Jun 23 '18 at 19:25
  • It doesn't matter whether "fit" works, unless you say it's exactly the "same data". In which case, "fit" might still be shuffling it and creating valid batches by this. – Daniel Möller Jun 23 '18 at 19:26
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/173687/discussion-between-zak-keirn-and-daniel-moller). – Zak Keirn Jun 23 '18 at 20:08