
I'm getting acquainted with LSTMs and I need clarity on something. I'm modeling a time series using t-300:t-1 to predict t:t+60. My first approach was to set up an LSTM like this:

# fake dataset to put words into code:
import numpy as np
from keras.models import Sequential
from keras.layers import Dense, LSTM

series = np.arange(1.0, 1001.0)  # toy stand-in for the real series
# sliding windows: 300 past values predict the next 60
X = np.array([series[i:i + 300] for i in range(len(series) - 360)])
y = np.array([series[i + 300:i + 360] for i in range(len(series) - 360)])

# LSTM requires (num_samples, timesteps, num_features)
X = X.reshape(X.shape[0], 1, X.shape[1])  # 1 timestep, 300 features

n_batch = 1
n_neurons = [50]

model = Sequential()
model.add(LSTM(n_neurons[0], batch_input_shape=(n_batch, X.shape[1], X.shape[2]), stateful=True))
model.add(Dense(y.shape[1]))
model.compile(loss='mse', optimizer='adam')

model.fit(X, y, epochs=1, batch_size=n_batch, verbose=1, shuffle=False)

With my real dataset, the results have been suboptimal. On CPU, one epoch over roughly 400,000 samples took about 20 minutes. The network converged after a single epoch, and it produced the same output regardless of which input points I fed it.

My latest change has been to reshape X in the following way:

X = X.reshape(X.shape[0],X.shape[1],1)

Training is noticeably slower (I have not yet tried the full dataset): it takes about 5 minutes to train a single epoch over 2,800 samples. I toyed around with a smaller subset of my real data and a smaller number of epochs, and it seems promising: I no longer get the same output for different inputs.

Can anyone help me understand what is happening here?

user4446237
1 Answer


In Keras, the timesteps dimension in (num_samples, timesteps, num_features) determines how many steps backpropagation through time (BPTT) unrolls the network and propagates the error back.

More timesteps mean more computation per sample, hence the slowdown you are observing.

X.reshape(X.shape[0], X.shape[1], 1) is the right thing to do in your case, since what you have is a single feature observed over 300 timesteps. Your first reshape, to (num_samples, 1, 300), instead told the LSTM you had 300 features at a single timestep, so the recurrence only ran once and the network never processed your windows as sequences.
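To make the shape difference concrete, here is a small numpy sketch (toy data; the 300-in/60-out window sizes are taken from the question) showing what each reshape produces:

import numpy as np

series = np.arange(1.0, 1001.0)          # toy series standing in for the real data
window_in, window_out = 300, 60

# sliding windows: 300 past values per sample
X = np.array([series[i:i + window_in] for i in range(len(series) - window_in - window_out)])

# first reshape: 1 timestep with 300 features -- the LSTM recurrence runs only once
X_wide = X.reshape(X.shape[0], 1, X.shape[1])
print(X_wide.shape)   # (640, 1, 300)

# second reshape: 300 timesteps with 1 feature -- the LSTM unrolls 300 steps
X_long = X.reshape(X.shape[0], X.shape[1], 1)
print(X_long.shape)   # (640, 300, 1)

Both arrays hold exactly the same numbers; only the axis the LSTM treats as "time" changes.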

vishnu viswanath
  • I have a dataset with dimensions (82 x 15 x 2). My input window size is 15, my output window size is 12, and I slide the input window with a lag of one. Can you explain what (num_samples, timesteps, num_features) means in this case? – Nomiluks Jun 28 '18 at 11:58
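As an illustration for the comment's numbers, here is a guess at that setup (a 108-step, two-feature series sliced into the stated 15-in/12-out windows with lag one; the targets are assumed to share the same two features):

import numpy as np

series = np.random.rand(108, 2)          # made-up (time, features) data
window_in, window_out = 15, 12
n = len(series) - window_in - window_out + 1   # number of sliding windows

X = np.array([series[i:i + window_in] for i in range(n)])
y = np.array([series[i + window_in:i + window_in + window_out] for i in range(n)])

print(X.shape)   # (82, 15, 2): 82 samples, 15 timesteps, 2 features per step
print(y.shape)   # (82, 12, 2): 12 future steps per sample

So in (num_samples, timesteps, num_features) terms: 82 windows, each unrolled over 15 timesteps, with 2 values observed at every step.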