While reading the seq2seq paper (Sutskever, I., Vinyals, O., & Le, Q. V. (2014). Sequence to sequence learning with neural networks. Advances in Neural Information Processing Systems, 27), I noticed that in section 3.4, Training Details [1], the authors say they used a 4-layer LSTM. However, in section 3.6, Experimental Results [2], they say they used an ensemble of 5 reversed LSTMs.
I'm pretty confused about this, since I thought an ensemble of 5 reversed LSTMs meant a 5-layer LSTM, which conflicts with section 3.4. I don't know whether this is a typo or whether I'm misunderstanding what the 5 reversed LSTMs are.
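To make my confusion concrete, here is a toy PyTorch sketch (my own illustration with made-up sizes, not anything from the paper) of the two readings I can think of: a single stacked LSTM with 5 layers, versus an ensemble of 5 separately trained 4-layer LSTMs whose output distributions are averaged:

```python
import torch
import torch.nn as nn

# Toy sizes, just so the sketch runs quickly (not the paper's numbers).
vocab_size, emb_dim, hidden, seq_len, batch = 100, 32, 64, 7, 3
tokens = torch.randint(0, vocab_size, (seq_len, batch))
x = nn.Embedding(vocab_size, emb_dim)(tokens)

# Reading A (what I first assumed): "5 reversed LSTMs" = one stacked LSTM with 5 layers.
five_layer_lstm = nn.LSTM(emb_dim, hidden, num_layers=5)
out_a, _ = five_layer_lstm(x)

# Reading B: an ensemble of 5 independent models, each itself a 4-layer LSTM
# (matching section 3.4), whose predicted distributions are combined, e.g. averaged.
ensemble = [nn.LSTM(emb_dim, hidden, num_layers=4) for _ in range(5)]
proj = nn.Linear(hidden, vocab_size)  # one shared output layer just to keep the sketch short
probs = [torch.softmax(proj(model(x)[0]), dim=-1) for model in ensemble]
out_b = torch.stack(probs).mean(dim=0)  # averaged predictions of the 5 models

print(out_a.shape)  # (seq_len, batch, hidden) -- a single deeper network
print(out_b.shape)  # (seq_len, batch, vocab)  -- averaged ensemble prediction
```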
[1]:
> We used deep LSTMs with 4 layers, with 1000 cells at each layer and 1000 dimensional word embeddings, with an input vocabulary of 160,000 and an output vocabulary of 80,000.
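For reference, this is how I would translate the quoted numbers into a model definition; again, this is just my own PyTorch sketch, not the authors' actual code:

```python
import torch.nn as nn

# Numbers taken from the quoted passage [1]; everything else is my guess.
INPUT_VOCAB, OUTPUT_VOCAB = 160_000, 80_000
EMB_DIM, HIDDEN, LAYERS = 1000, 1000, 4

encoder_embedding = nn.Embedding(INPUT_VOCAB, EMB_DIM)
encoder = nn.LSTM(EMB_DIM, HIDDEN, num_layers=LAYERS)   # deep LSTM: 4 layers x 1000 cells

decoder_embedding = nn.Embedding(OUTPUT_VOCAB, EMB_DIM)
decoder = nn.LSTM(EMB_DIM, HIDDEN, num_layers=LAYERS)
output_projection = nn.Linear(HIDDEN, OUTPUT_VOCAB)     # softmax over the 80k output vocabulary
```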
I have Googled:
- "5 reversed LSTMs" meaning
- "ensemble 5 reversed LSTMs" meaning
- "Ensemble of 5 reversed LSTMs" 4 layers
and found nothing helpful, and the repositories on GitHub don't have any relevant issues or questions about this.