I'm not sure if this has been asked before, but I am encountering a problem computing gradients with a custom loss function. I'm not quite sure how to title the question, but it seems that unless I use the model output directly in the loss computation, I get the following error:

 Error in py_call_impl(callable, dots$args, dots$keywords) : 
    ValueError: No gradients provided for any variable: [ list of model variables ]

My failing loss computation is:

    # ... (preceding code elided)
    # Pull the model output back into R as a plain matrix
    out <- as.matrix(mdl_output$numpy())
    # Target ranks as an R matrix
    act <- matrix(rep(1:nrow(out)), ncol = ncol(out), nrow = nrow(out), byrow = TRUE)
    # Rank each row (rowApply and Rank are helper functions) and wrap in a Variable
    rnk <- tf$Variable(rowApply(out, Rank), dtype = tf$float32)
    return(
        tf$losses$mean_squared_error(
            labels = act,
            predictions = rnk
        )
    )

The `grad_tape$gradient(mdl_loss, mdl$variables)` call returns an empty list.
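
The same symptom shows up outside my model in a minimal sketch with the tensorflow R package's eager API (the variable names here are illustrative only, not from my actual code):

    library(tensorflow)

    x <- tf$Variable(c(1, 2, 3), dtype = tf$float32)

    with(tf$GradientTape() %as% tape, {
      # Round-tripping through $numpy() yields a constant as far as the tape knows
      detached <- tf$constant(x$numpy(), dtype = tf$float32)
      loss <- tf$reduce_sum(tf$square(detached))
    })
    tape$gradient(loss, x)  # NULL -- no path from loss back to x

However, the following works: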

    # ... (preceding code elided)
    out <- as.matrix(mdl_output$numpy())
    act <- matrix(rep(1:nrow(out)), ncol = ncol(out), nrow = nrow(out), byrow = TRUE)
    rnk <- tf$Variable(rowApply(out, Rank), dtype = tf$float32)
    # Tie the ranks back to the model output by adding a zero tensor built from it
    prd <- tf$add(tf$subtract(mdl_output, mdl_output), rnk)
    return(
        tf$losses$mean_squared_error(
            labels = act,
            predictions = prd
        )
    )

Note that `prd` was built using `mdl_output`, whereas `rnk` was made from an R matrix that came from a numpy array.
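
Continuing the sketch above, the same zero-add trick makes the gradient defined again:

    with(tf$GradientTape() %as% tape, {
      detached <- tf$constant(x$numpy(), dtype = tf$float32)
      # Re-attach via (x - x) + detached, mirroring the prd construction
      reattached <- tf$add(tf$subtract(x, x), detached)
      loss <- tf$reduce_sum(tf$square(reattached))
    })
    tape$gradient(loss, x)  # now a tensor rather than NULL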

Why do I need to do the latter? What is being passed by `prd` that is somehow getting lost by `rnk`?

FWIW: I am using the R tensorflow and keras packages with eager execution.
