I would like to write a tutorial on using LSTM in MXNet, based on an existing TensorFlow example (located at https://github.com/mouradmourafiq/tensorflow-lstm-regression/blob/master/lstm_sin.ipynb). Here is my main code:

import mxnet as mx
import numpy as np
import pandas as pd
import argparse
import os
import sys
from data_processing import generate_data
import logging
head = '%(asctime)-15s %(message)s'
logging.basicConfig(level=logging.DEBUG, format=head)
TIMESTEPS = 3
BATCH_SIZE = 100
X, y = generate_data(np.sin, np.linspace(0, 100, 10000), TIMESTEPS, seperate=False)
train_iter = mx.io.NDArrayIter(X['train'], y['train'], batch_size=BATCH_SIZE, shuffle=True, label_name='lro_label')
eval_iter = mx.io.NDArrayIter(X['val'], y['val'], batch_size=BATCH_SIZE, shuffle=False)
test_iter = mx.io.NDArrayIter(X['test'], batch_size=BATCH_SIZE, shuffle=False)
num_layers = 3
num_hidden = 50

data = mx.sym.Variable('data')
label = mx.sym.Variable('lro_label')

stack = mx.rnn.SequentialRNNCell()
for i in range(num_layers):
    stack.add(mx.rnn.LSTMCell(num_hidden=num_hidden, prefix='lstm_l%d_'%i))
#stack.reset()
outputs, states = stack.unroll(length=TIMESTEPS,
                               inputs=data,
                               layout='NTC',
                               merge_outputs=True)

outputs = mx.sym.reshape(outputs, shape=(BATCH_SIZE, -1))
# fc1 maps the flattened LSTM output to shape (batch_size, 1);
# without it, the unrolled output shape would not match the label shape.
outputs = mx.sym.FullyConnected(data=outputs, num_hidden=1, name='fc1')
label = mx.sym.reshape(label, shape=(-1,))
outputs = mx.sym.LinearRegressionOutput(data=outputs, 
                               label=label,
                               name='lro')
contexts = mx.cpu(0)
# Pass the context explicitly; it was defined above but never used.
model = mx.mod.Module(symbol = outputs,
                     data_names = ['data'],
                     label_names = ['lro_label'],
                     context = contexts)
model.fit(train_iter, eval_iter,
         optimizer_params = {'learning_rate':0.005},
         num_epoch=4,
         batch_end_callback=mx.callback.Speedometer(BATCH_SIZE, 2))

This code runs, but the reported train accuracy is NaN. How can I fix this? Also, since the unrolled output has a sequence-length dimension, how can it be matched to the label shape? Does my fc1 layer make sense?
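To make the shape question concrete, here is a minimal sketch that traces the shapes with NumPy stand-ins rather than MXNet symbols. It assumes the merged unroll output has layout (batch, time, hidden) and that there is one target value per sequence; the random arrays and the `weight`/`bias` names are placeholders for illustration only.

```python
import numpy as np

BATCH_SIZE, TIMESTEPS, NUM_HIDDEN = 100, 3, 50

# Stand-in for the merged unroll output with layout 'NTC': (batch, time, hidden).
lstm_out = np.random.rand(BATCH_SIZE, TIMESTEPS, NUM_HIDDEN)

# Flatten the time and hidden dimensions, like the reshape before fc1.
flat = lstm_out.reshape(BATCH_SIZE, -1)

# Stand-in for fc1 with num_hidden=1: one weight per flattened feature.
weight = np.random.rand(1, TIMESTEPS * NUM_HIDDEN)
bias = np.zeros(1)
pred = flat @ weight.T + bias

# One target value per sequence, like the label reshaped to (-1,).
label = np.random.rand(BATCH_SIZE)

print(flat.shape, pred.shape, label.shape)  # (100, 150) (100, 1) (100,)
```

Under these assumptions the prediction holds exactly one value per sequence, so it has the same number of elements as the label, which is what the regression output needs.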

user2189731
  • The code has no problem in general, and I've finally made it work. It runs times slower than TF, though; I'm not sure why. I may post it to the MXNet tutorials as a beginner LSTM example, since I found the MXNet examples very complex in general. – user2189731 Jan 02 '18 at 02:11
  • This runs fine for me and converges superfast. – Emil Feb 13 '20 at 21:08

1 Answer

Pass auto_reset=False to the Speedometer callback, i.e. batch_end_callback=mx.callback.Speedometer(BATCH_SIZE, 2, auto_reset=False). This should fix the NaN train-acc.
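A plausible reason this helps can be sketched in plain Python: if the running metric is reset every time it is logged, it may contain no samples when the end-of-epoch value is read. The `RunningMetric` class and `run_epoch` function below are hypothetical stand-ins written for illustration, not MXNet's own classes, and the assumption that an empty metric reports NaN is exactly that, an assumption.

```python
import math


class RunningMetric:
    """Hypothetical stand-in for a running evaluation metric."""

    def __init__(self):
        self.reset()

    def reset(self):
        self.total = 0.0
        self.count = 0

    def update(self, value, n):
        # Accumulate a per-batch average weighted by the batch size.
        self.total += value * n
        self.count += n

    def get(self):
        # Assumption: an empty metric reports NaN instead of raising.
        return self.total / self.count if self.count else float('nan')


def run_epoch(num_batches, frequent, auto_reset):
    """Simulate one epoch with a callback that logs every `frequent` batches."""
    metric = RunningMetric()
    for batch in range(1, num_batches + 1):
        metric.update(0.5, 100)
        if batch % frequent == 0:
            metric.get()          # value that would be logged
            if auto_reset:
                metric.reset()    # clears everything accumulated so far
    return metric.get()           # value read at the end of the epoch


print(math.isnan(run_epoch(4, 2, auto_reset=True)))  # True
print(run_epoch(4, 2, auto_reset=False))             # 0.5
```

In this sketch the NaN appears only when the number of batches is a multiple of the logging interval, because the last reset then happens right before the end-of-epoch read.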

Yizhi