I'm using Google's seq2seq library. Based on some computations that I do between the predictions and the input, I would like to zero out the losses for certain time steps (these are not padding steps).
What I do is basically go through each batch, then through each time step of the decoder (logits), and for each time step I append a zero or a one to a list (based on my computation). This list is then converted to a tensor and multiplied by the losses.
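To make the procedure concrete, here is a minimal framework-agnostic sketch in plain Python, using nested lists in place of tensors. The helper names (build_mask, apply_mask), the [batch][time] layout, and the not_copied rule are all hypothetical stand-ins for the real computation; in TensorFlow the last step would be an element-wise product of two tensors with the same shape.

```python
def build_mask(predictions, inputs, keep_step):
    """Return a [batch][time] list holding 1.0 (keep loss) or 0.0 (drop loss)."""
    mask = []
    for pred_seq, input_seq in zip(predictions, inputs):  # loop over the batch
        row = []
        for t, pred in enumerate(pred_seq):  # loop over decoder time steps
            row.append(1.0 if keep_step(pred, input_seq, t) else 0.0)
        mask.append(row)
    return mask


def apply_mask(losses, mask):
    """Multiply per-step losses by the mask element-wise; shapes must match."""
    if len(losses) != len(mask):
        raise ValueError("batch sizes differ")
    masked = []
    for loss_row, mask_row in zip(losses, mask):
        if len(loss_row) != len(mask_row):
            raise ValueError("time dimensions differ")
        masked.append([loss * keep for loss, keep in zip(loss_row, mask_row)])
    return masked


# Hypothetical rule, for illustration only: drop the loss at any step whose
# predicted id already occurs in the input sequence.
def not_copied(pred, input_seq, t):
    return pred not in input_seq


predictions = [[4, 7, 2], [5, 5, 9]]          # predicted ids, [batch][time]
inputs = [[7, 1, 3], [8, 9, 6]]               # input ids, [batch][time]
losses = [[0.5, 1.0, 0.25], [2.0, 0.5, 1.5]]  # per-step losses, [batch][time]

mask = build_mask(predictions, inputs, not_copied)
masked = apply_mask(losses, mask)
```

The explicit shape checks in apply_mask are there because this multiplication is exactly where a mismatch between the mask and the losses shows up.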
My problem is that the shape of the tensor returned by sparse_softmax_cross_entropy_with_logits is variable; it's not always the shape of the target tensor, so there is a dimension mismatch. Has anyone done something like this before and can share it, or does anyone know why this happens?
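One possible workaround, sketched in plain Python with nested lists standing in for tensors, is to crop the losses and the mask to their common number of time steps before multiplying. This assumes that the dimension that varies is time and that the layout is [batch][time]; neither is confirmed, and the helper name align_time_steps is hypothetical. It only hides the mismatch and does not explain why the shapes differ.

```python
def align_time_steps(losses, mask):
    """Crop each pair of rows to their common length (assumed layout: [batch][time])."""
    if len(losses) != len(mask):
        raise ValueError("batch sizes differ")
    aligned_losses, aligned_mask = [], []
    for loss_row, mask_row in zip(losses, mask):
        steps = min(len(loss_row), len(mask_row))  # shared number of time steps
        aligned_losses.append(loss_row[:steps])
        aligned_mask.append(mask_row[:steps])
    return aligned_losses, aligned_mask


# Illustrative values: losses came back with 2 time steps,
# while the mask was built for 3 target steps.
losses = [[0.5, 1.0], [2.0, 0.5]]
mask = [[1.0, 0.0, 1.0], [1.0, 1.0, 0.0]]

aligned_losses, aligned_mask = align_time_steps(losses, mask)
masked = [
    [loss * keep for loss, keep in zip(loss_row, mask_row)]
    for loss_row, mask_row in zip(aligned_losses, aligned_mask)
]
```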