The Trax documentation for trax.layers.attention.ShiftRight (https://trax-ml.readthedocs.io/en/latest/trax.layers.html?highlight=tl.ShiftRight#trax.layers.attention.ShiftRight) says: "Applies only if layer is created in a non-'eval' mode."
There are three modes: 'train', 'eval', and 'predict'. As I understand it, tl.ShiftRight inserts a zero at the front of each sequence to mark the beginning of a sentence. I don't understand why this would not apply in 'eval' mode.
I would expect the start of the sentence to need a zero token in every mode, not only in 'train' and 'predict'.
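
To make clear what I mean by "inserting a zero", here is a minimal pure-Python sketch of the behavior I have in mind. It is not Trax's implementation: shift_right is a hypothetical helper I wrote for illustration, and I am assuming the output keeps the input length, so the last token drops off.

```python
def shift_right(batch, n_positions=1, pad=0):
    """Shift each sequence right by n_positions, padding the front with `pad`.

    Assumption: the output keeps the input length, so the last
    n_positions tokens of each sequence are dropped.
    """
    shifted = []
    for seq in batch:
        # Prepend the padding, then cut back to the original length.
        padded = [pad] * n_positions + list(seq)
        shifted.append(padded[:len(seq)])
    return shifted


batch = [[5, 6, 7, 8], [9, 10, 11, 12]]
print(shift_right(batch))  # [[0, 5, 6, 7], [0, 9, 10, 11]]
```

This is the transformation I would expect to see in all three modes, which is why the note about 'eval' confuses me.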