Unable to understand char-lstm example in MXNET for Julia

Question

I am trying to understand the char-lstm example mentioned here: char-lstm Julia example

The function lstm_cell accepts the previous state as its second parameter:
function lstm_cell(data::mx.SymbolicNode, prev_state::LSTMState, param::LSTMParam; num_hidden::Int=512, dropout::Real=0, name::Symbol=gensym())
However, in the section marked #stack LSTM cells:

next_state = lstm_cell(hidden, l_state, l_param, num_hidden=dim_hidden, dropout=dp, name=Symbol(name, "lstm$t"))
hidden = next_state.h
layer_param_states[i] = (l_param, next_state)

layer_param_states[i] is overwritten with the next state: layer_param_states[i] = (l_param, next_state)
Why is this done here? Why is the previous state being replaced with the next state?

Please use backticks (`) to quote your code. (Or highlight the piece of code and click on the "{}" button.) — David P. Sanders, Feb 19 '17 at 19:08

score 1 · Answer 1 · answered Jan 11 '18 at 00:01

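Because layer_param_states stores the final states of the sequence. Each time step overwrites the entry, so the next time step reads it as its previous state, and after the last step it holds the final state. Note that in https://github.com/dmlc/MXNet.jl/blob/master/examples/char-lstm/lstm.jl#L110 the final state is grouped and will be used to build the loss with the provided labels.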

Just FYI, the Python example does exactly the same thing: https://github.com/apache/incubator-mxnet/blob/master/example/rnn/old/lstm.py#L167. The name last_states used there makes more sense.
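
To make the overwrite pattern concrete, here is a minimal sketch in plain Julia that does not use MXNet at all. ToyState, toy_cell and unroll are hypothetical stand-ins invented for illustration (the arithmetic inside toy_cell is made up and is not an LSTM); only the way the per-layer state is read and then replaced mirrors the loop in the example.

```julia
# Hypothetical stand-in for the example's LSTMState (not part of MXNet.jl).
struct ToyState
    c::Float64  # cell memory
    h::Float64  # hidden output
end

# Made-up recurrent step: consumes the previous state and returns the next one.
function toy_cell(x::Float64, prev::ToyState)
    c = 0.5 * prev.c + x
    h = tanh(c)
    return ToyState(c, h)
end

function unroll(inputs::Vector{Float64}, n_layer::Int)
    # One state slot per layer, initialised to zeros.
    states = [ToyState(0.0, 0.0) for _ in 1:n_layer]
    for x in inputs                 # loop over time steps
        hidden = x
        for i in 1:n_layer          # loop over stacked layers
            next_state = toy_cell(hidden, states[i])
            hidden = next_state.h   # output feeds the layer above
            states[i] = next_state  # read as "previous state" at the next time step
        end
    end
    return states                   # after the loop: the final state of each layer
end

final_states = unroll([1.0, 2.0, 3.0], 2)
```

If the line states[i] = next_state were removed, every time step would see the initial zero state and nothing would be carried along the sequence. With it, the slot that held the "previous" state at step t holds the "previous" state for step t+1, and after the last step it holds the final state.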
