I'm trying to solve a multivariate time series prediction problem. My input has 4 variables, and my target is another variable.
I've processed the data as follows: each input sequence covers 4 variables over 60 timesteps, so each input has shape (1, 240).
I want to predict the next n steps of the target. During training, n is 60, so the target has shape (1, 60).
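To make the shapes concrete, here is a minimal sketch of how one sample could be built. The random tensors are hypothetical stand-ins for the real data, and the time-major flattening order (all 4 variables of timestep 0, then timestep 1, and so on) is an assumption, since the exact preprocessing is not shown here.

```python
import torch

n_variables, n_timesteps = 4, 60

# Hypothetical stand-in for one window of real data: 60 timesteps x 4 variables.
window = torch.randn(n_timesteps, n_variables)

# Flatten the window into one row vector (assumed time-major order):
# the 4 variables of timestep 0 come first, then timestep 1, and so on.
sample_input = window.reshape(1, n_variables * n_timesteps)

# Hypothetical target: the next 60 values of the single target variable.
sample_target = torch.randn(1, n_timesteps)

print(sample_input.shape)   # torch.Size([1, 240])
print(sample_target.shape)  # torch.Size([1, 60])
```

Flattening this way means the model sees the whole 60-step window as a single 240-dimensional feature vector rather than as a sequence of 60 steps.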
Here is my LSTMPredictor class:
class LSTMPredictor(nn.Module):
    def __init__(self, n_feature, n_hidden=51):
        super(LSTMPredictor, self).__init__()
        self.n_hidden = n_hidden
        # lstm1, lstm2, linear
        self.lstm1 = nn.LSTMCell(n_feature, self.n_hidden)
        self.lstm2 = nn.LSTMCell(self.n_hidden, self.n_hidden)
        self.lstm3 = nn.LSTMCell(self.n_hidden, self.n_hidden)
        self.linear = nn.Linear(self.n_hidden, 1)

    def forward(self, x, future=0):
        outputs = []
        # lstm1
        h_t = torch.zeros(1, self.n_hidden, dtype=torch.float32).cuda()
        c_t = torch.zeros(1, self.n_hidden, dtype=torch.float32).cuda()
        # lstm2
        h_t2 = torch.zeros(1, self.n_hidden, dtype=torch.float32).cuda()
        c_t2 = torch.zeros(1, self.n_hidden, dtype=torch.float32).cuda()
        # lstm3
        h_t3 = torch.zeros(1, self.n_hidden, dtype=torch.float32).cuda()
        c_t3 = torch.zeros(1, self.n_hidden, dtype=torch.float32).cuda()

        h_t, c_t = self.lstm1(x, (h_t, c_t))
        h_t2, c_t2 = self.lstm2(h_t, (h_t2, c_t2))

        output = None
        for i in range(future):
            if i == 0:
                # first prediction
                output = self.linear(h_t3)  # h_t3?
                outputs.append(output)
                continue
            h_t3, c_t3 = self.lstm3(h_t3, (h_t3, c_t3))
            output = self.linear(h_t3)
            outputs.append(output)
        output = torch.cat(outputs, dim=1)
        return output
Here, lstm1 and lstm2 receive the input with shape (1, 240), and then lstm3 is used to generate the predictions for the next n steps successively. During training, n is 60.
However, my model runs into exploding gradients in the first step.
The model initialization is shown below:
n_hidden = 512
n_feature = 240
model = LSTMPredictor(n_feature, n_hidden).to(device)
criterion = nn.MSELoss().to(device)
optimizer = optim.LBFGS(model.parameters(), lr=0.8)
Training Loop:
n_steps = 1
losses = []
print("--- Training Start ---")
for i in tqdm(range(n_steps)):
    print("Step", i)
    for i, sample_i in enumerate(train_input):
        def closure():
            optimizer.zero_grad()
            out = model(sample_i.cuda(), future=60)
            loss = criterion(out[0], train_target[i].cuda())
            losses.append(loss.item())
            loss.backward()
            return loss
        optimizer.step(closure)
    print("loss", losses[-1])
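To check whether the gradients really are exploding, the total gradient norm can be measured right after a backward pass. The following is a minimal sketch, not part of my training code: `total_grad_norm` is a hypothetical helper, and the small linear model and random tensors are stand-ins so the snippet runs on its own. It also shows `torch.nn.utils.clip_grad_norm_`, which rescales the gradients in place; whether clipping combines well with LBFGS is something I have not verified.

```python
import torch
import torch.nn as nn


def total_grad_norm(model: nn.Module) -> float:
    """Return the L2 norm of all parameter gradients, treated as one vector."""
    norms = [p.grad.detach().norm(2) for p in model.parameters() if p.grad is not None]
    if not norms:
        return 0.0
    return torch.norm(torch.stack(norms), 2).item()


# Hypothetical stand-in model and data, used only to make the sketch runnable.
torch.manual_seed(0)
toy_model = nn.Linear(240, 60)
toy_input = torch.randn(1, 240)
toy_target = torch.randn(1, 60)

loss = nn.MSELoss()(toy_model(toy_input), toy_target)
loss.backward()

# Measure the gradient norm before and after clipping.
norm_before = total_grad_norm(toy_model)
# Rescales gradients in place so that their total norm is at most max_norm.
torch.nn.utils.clip_grad_norm_(toy_model.parameters(), max_norm=1.0)
norm_after = total_grad_norm(toy_model)

print("grad norm before clipping:", norm_before)
print("grad norm after clipping:", norm_after)
```

In the real loop, the same helper could be called inside the closure directly after `loss.backward()` to log the norm for each sample.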
Is there anything wrong with my implementation?