In this blog post on Recurrent Neural Networks by Denny Britz, the author states: "The above diagram has outputs at each time step, but depending on the task this may not be necessary. For example, when predicting the sentiment of a sentence we may only care about the final output, not the sentiment after each word. Similarly, we may not need inputs at each time step."
Consider the case where we take an output only at the final timestep: how does backpropagation change if there are no outputs at the intermediate time steps, only the final one? I thought we needed to define a loss at each time step, but how can we do that without outputs?
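For concreteness, here is a minimal sketch of the setup I mean (plain numpy, with made-up shapes; the weight names U, W, V follow the blog's notation). The hidden state is still updated at every timestep, but the output and the cross-entropy loss are computed only once, from the last hidden state:

```python
import numpy as np

# Hypothetical sizes for illustration only
T, input_dim, hidden_dim, output_dim = 5, 8, 16, 3

U = np.random.randn(hidden_dim, input_dim) * 0.01   # input-to-hidden weights
W = np.random.randn(hidden_dim, hidden_dim) * 0.01  # hidden-to-hidden weights
V = np.random.randn(output_dim, hidden_dim) * 0.01  # hidden-to-output weights

x = np.random.randn(T, input_dim)  # one input vector per timestep
y = 2                              # a single label for the whole sequence

h = np.zeros(hidden_dim)
for t in range(T):
    h = np.tanh(U @ x[t] + W @ h)  # hidden state updated at every timestep

# Output and loss only at the final timestep
logits = V @ h
probs = np.exp(logits - logits.max())
probs /= probs.sum()
loss = -np.log(probs[y])           # single cross-entropy loss for the sequence
```

In this setup there is only one loss term, so I don't see where the per-timestep losses from the blog's derivation would come from.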