
I am trying to build an LSTM model for multistep prediction. My data is a time series of parking occupancy rates sampled every five minutes (I have 25 weeks of samples). I started with the code below:

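import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dropout, Dense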
training_data_len = int(np.ceil( len(data) * .90 ))
train_data = data.iloc[0:int(training_data_len), :]
print(len(train_data))
# Create the testing data set

test_data = data.iloc[training_data_len: , :] # - timestep
print(len(test_data))

data_train = np.array(train_data)
def split_sequence(sequence, n_steps_in, n_steps_out):
    X, y = list(), list()
    for i in range(len(sequence)):
        # find the end of this pattern
        end_ix = i + n_steps_in
        out_end_ix = end_ix + n_steps_out
        # check if we are beyond the sequence
        if out_end_ix > len(sequence):
            break
        # gather input and output parts of the pattern
        seq_x, seq_y = sequence[i:end_ix], sequence[end_ix:out_end_ix]
        X.append(seq_x)
        y.append(seq_y)
    return np.array(X), np.array(y)

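# Build 6-step input / 6-step output windows (30 minutes each)
X_train, y_train = split_sequence(data_train, 6, 6)
# Assuming `data` has a single column: X_train is (samples, 6, 1);
# flatten the targets to (samples, 6) to match the Dense(6) output
y_train = y_train.reshape(len(y_train), -1)

reg = Sequential()
# input_shape is (timesteps, features) = (6, 1)
reg.add(LSTM(units=200, return_sequences=True, input_shape=(6, 1)))
reg.add(Dropout(0.2))

reg.add(LSTM(units=200, return_sequences=True))
reg.add(Dropout(0.2))
# The last LSTM returns only its last output, so the model outputs (samples, 6)
reg.add(LSTM(units=200))
reg.add(Dropout(0.2))

reg.add(Dense(6))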
# Use mean squared error as the loss and Adam as the optimizer

reg.compile(loss='mse', optimizer='adam')
# Train the model
# Optional arguments: validation_split=0.1, shuffle=False
reg.fit(X_train, y_train, epochs=10, verbose=1)

data_test = np.array(test_data)
# Build 6-step input / 6-step output windows from the test set

X_test, y_test = split_sequence(data_test,6,6)

y_pred = reg.predict(X_test)

My goal is to use the past 30 minutes (6 samples) to predict the next 30 minutes (6 samples).

I'm new to this kind of model, and I'd like to know whether my approach is sound, whether I'm missing something, or whether there are improvements I could make. Thank you.

  • What is the question? (Edit the title.) What is the issue? If it is working, then (forgive me if I'm wrong) I believe Stack Overflow may not be the right place to ask, but I'll try to help anyway, because I'm a rebel! :) – Florian Fasmeyer Jun 25 '21 at 13:47
  • Sorry, I'm trying to understand whether my approach is right for a multistep task – Luca Minutillo Jun 25 '21 at 14:07

1 Answer


Question: Is there an issue with my approach?

Usually, you will want to try out multiple models and multiple hyperparameters. Even for a toy project, you should try at least a few different models. Make sure you understand how each model works before setting its parameters.
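When comparing models, it can help to have a trivial baseline to beat. The following is a minimal sketch of a persistence forecast (repeat the last observed value), assuming windows shaped like those from `split_sequence` with a single feature squeezed out. `persistence_forecast` and `rmse_per_step` are hypothetical helper names, and the toy arrays are made up for illustration:

```python
import numpy as np

def persistence_forecast(X, n_steps_out):
    """Repeat the last observed value of each input window n_steps_out times.

    X has shape (samples, n_steps_in); the result has shape (samples, n_steps_out).
    """
    last = X[:, -1]
    return np.repeat(last[:, None], n_steps_out, axis=1)

def rmse_per_step(y_true, y_pred):
    """Root mean squared error for each forecast step, shape (n_steps_out,)."""
    return np.sqrt(np.mean((y_true - y_pred) ** 2, axis=0))

# Toy data (made up): 3 windows of 4 past values, with 2 future values each
X_toy = np.array([[0.1, 0.2, 0.3, 0.4],
                  [0.2, 0.3, 0.4, 0.5],
                  [0.3, 0.4, 0.5, 0.6]])
y_toy = np.array([[0.5, 0.6],
                  [0.6, 0.7],
                  [0.7, 0.8]])

baseline = persistence_forecast(X_toy, n_steps_out=2)
errors = rmse_per_step(y_toy, baseline)
print(errors)  # one error value per forecast step
```

If the LSTM does not beat this baseline at each forecast step on the test windows, the extra model complexity is probably not paying off yet.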

You may want to feed in more data than you predict, for example 1 hour in (12 samples at your 5-minute rate) and 10 minutes out (2 samples).
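As a rough sketch of what a longer input window and a shorter output window look like in practice, assuming a 1-D series sampled every 5 minutes: `make_windows` is a hypothetical helper (a compact variant of the question's `split_sequence`), and the toy series is a stand-in for the real occupancy data.

```python
import numpy as np

def make_windows(series, n_steps_in, n_steps_out):
    """Slice a 1-D series into (input, target) windows."""
    series = np.asarray(series)
    n_windows = len(series) - n_steps_in - n_steps_out + 1
    if n_windows <= 0:
        raise ValueError("series is too short for the requested window sizes")
    X = np.stack([series[i:i + n_steps_in] for i in range(n_windows)])
    y = np.stack([series[i + n_steps_in:i + n_steps_in + n_steps_out]
                  for i in range(n_windows)])
    return X, y

SAMPLE_MINUTES = 5            # sampling interval from the question
n_in = 60 // SAMPLE_MINUTES   # 1 hour of history
n_out = 10 // SAMPLE_MINUTES  # 10 minutes ahead

toy = np.arange(20)           # stand-in for the occupancy series
X, y = make_windows(toy, n_in, n_out)
print(X.shape, y.shape)
```

For an LSTM, `X` would still need a trailing feature axis, for example `X[..., None]`, and the final `Dense` layer would need `n_out` units.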

You may want to do some data analysis before running any code, to get some insight into what might work. Make it visual: create plots, for example of a PCA projection (which may not work well with time series).
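One simple check that does suit time series is the autocorrelation at a lag of one day, which hints at how much daily seasonality a model could exploit. This is a swapped-in suggestion rather than something from the question; `autocorr` is a hypothetical helper, and the series below is synthetic, not real parking data:

```python
import numpy as np

def autocorr(series, lag):
    """Sample autocorrelation of a 1-D series at a given positive lag."""
    x = np.asarray(series, dtype=float)
    x = x - x.mean()
    return float(np.dot(x[:-lag], x[lag:]) / np.dot(x, x))

SAMPLES_PER_DAY = 24 * 60 // 5  # 288 samples per day at a 5-minute rate

# Synthetic stand-in: a pure daily cycle over 14 days
t = np.arange(14 * SAMPLES_PER_DAY)
toy = 0.5 + 0.4 * np.sin(2 * np.pi * t / SAMPLES_PER_DAY)

daily = autocorr(toy, SAMPLES_PER_DAY)           # strongly positive for a daily cycle
half_day = autocorr(toy, SAMPLES_PER_DAY // 2)   # strongly negative: half a cycle out of phase
print(daily, half_day)
```

On real data, a high value at the one-day or one-week lag would suggest adding time-of-day or day-of-week features, or a longer input window.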

Talking about models: you can replace your LSTM with a Transformer. It is a more recent architecture that can retain information over longer ranges and often outperforms LSTMs, although not in every setting.

If you have questions about data science or machine learning, you should try datascience.StackExchange instead of StackOverflow. Here we are supposed to help with quick, snappy answers about code. ;)

Florian Fasmeyer