
I'm a complete beginner in machine learning, and I want to forecast multiple future samples of a time series. The series has one sample every 15 minutes, and I have to forecast the next 3 days, i.e. 288 samples into the future.

My time series also has categorical features, so I implemented a model based on this answer.

I have read about encoder-decoder models for seq2seq time series forecasting, but I couldn't work out how to implement one and combine it with multiple categorical features.

  1. Am I going in the right direction by following that answer?
  2. Will an LSTM work properly even for a large Y dimension (in my case 288 time steps into the future)?
  3. I'm using the last 7 days of samples as X, so my input shape for the LSTM is (number of samples, 672, 1). Is that okay?
  4. Should I go for an encoder-decoder? If so, could anyone provide some more insight and maybe a good tutorial?
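
For reference, the sizes above follow from the 15-minute interval: 96 samples per day, so 7 days of history is 672 steps and 3 days of forecast is 288 steps. The sketch below shows one way such (X, Y) pairs could be cut from a single series. It is only an illustration under assumptions: the series is univariate and held in a plain Python list, and `make_windows`, `stride`, and the synthetic data are hypothetical names that do not come from the linked answer.

```python
SAMPLES_PER_DAY = 24 * 60 // 15   # 96 samples per day at a 15-minute interval
LOOKBACK = 7 * SAMPLES_PER_DAY    # 672 past steps used as X
HORIZON = 3 * SAMPLES_PER_DAY     # 288 future steps used as Y


def make_windows(series, lookback=LOOKBACK, horizon=HORIZON, stride=1):
    """Slice a 1-D sequence into (X, Y) pairs of consecutive windows."""
    X, Y = [], []
    last_start = len(series) - lookback - horizon
    for start in range(0, last_start + 1, stride):
        split = start + lookback
        # Wrap each value in a list so X has a trailing feature axis of size 1.
        X.append([[value] for value in series[start:split]])
        Y.append(list(series[split:split + horizon]))
    return X, Y


# Hypothetical data: 14 days of a synthetic signal, used only to check shapes.
series = [float(i % SAMPLES_PER_DAY) for i in range(14 * SAMPLES_PER_DAY)]
X, Y = make_windows(series)

print(len(X), len(X[0]), len(X[0][0]))  # samples, look-back steps, features
print(len(Y), len(Y[0]))                # samples, forecast steps
```

Each X window has the (672, 1) layout mentioned in point 3, and each Y window holds the 288 values that immediately follow it. Categorical features would need an extra encoding step that this sketch does not cover.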

Thanks in advance.

Maharshi

1 Answer

  1. Yes.
  2. Depends on how much data you have and how learnable your problem is.
  3. The more data you use the better.
  4. An encoder-decoder architecture is just a fancy name for 'feedforward your hidden LSTM states'. I don't see a reason why you would need to use it in your case.
KonstantinosKokos
  • Sounds like a plan. You might wanna check this out: https://machinelearningmastery.com/multi-step-time-series-forecasting-long-short-term-memory-networks-python/ It's not anything groundbreaking but it might help unblock you if you're unsure what you're doing. – KonstantinosKokos Apr 09 '18 at 07:21
  • Thanks for the quick answer. I have the past year of data available, but right now I'm using only 1 month of data for quick training to check results; otherwise it takes too much time. I will use all the data once the model is final. My data is in table format with 1 row per time step. Currently my input shape is (2190, 672, 1). What do you mean by more data? Should I increase the number of samples (i.e. 2190), or should I use more past data to predict the next samples (i.e. increase 672)? – Maharshi Apr 09 '18 at 07:24
  • I have already crawled through that site, but I still don't feel very confident. Also, can you suggest how many cell units I should use? As a rookie, I believe the number of hidden units should be higher than the Y dimension, so it should be greater than 288, and it should be a power of 2. So will 512 units work? Or will anything less than that, which is not a power of 2, work okay? – Maharshi Apr 09 '18 at 07:27
  • Since you're talking about forecasting, I assume you have no more predictors to use other than your past values, so I don't see a way to increase the feature space. However, if you assume that the pattern of change is temporally invariant (i.e. your data change with time only depending on the values from the last N timesteps), then you can still use your far past data to predict their future counterparts (i.e. use data from 12 months ago to predict their "future" values from 11 months ago). This will increase your number of samples. Alternatively, you can increase the look-back span. – KonstantinosKokos Apr 09 '18 at 07:32
  • In any case there's no standard way that guarantees optimal results, so you must follow a trial and error approach and see what works best. – KonstantinosKokos Apr 09 '18 at 07:33
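
The two options raised in the comments (more samples from a longer history, or a longer look-back span) pull in different directions, and a rough count makes that visible. The sketch below is only an illustration under assumptions: one sample every 15 minutes, a 3-day horizon, and windows taken at every time step. `count_windows` and `stride` are hypothetical names, not part of any library.

```python
SAMPLES_PER_DAY = 96              # one sample every 15 minutes
HORIZON = 3 * SAMPLES_PER_DAY     # 288 steps to forecast


def count_windows(n_steps, lookback, horizon=HORIZON, stride=1):
    """Number of (X, Y) windows that fit in a series of n_steps points."""
    usable = n_steps - lookback - horizon
    if usable < 0:
        return 0  # the series is too short for even one window
    return usable // stride + 1


# Compare a short and a long history against two look-back spans.
counts = {}
for history_days in (30, 365):
    for lookback_days in (7, 14):
        counts[(history_days, lookback_days)] = count_windows(
            history_days * SAMPLES_PER_DAY,
            lookback_days * SAMPLES_PER_DAY,
        )

for (history_days, lookback_days), n in sorted(counts.items()):
    print(f"{history_days:>3} days of history, "
          f"{lookback_days:>2}-day look-back: {n} windows")
```

A longer history adds windows, while a longer look-back removes a few and makes each one larger. Neither count says anything about forecast quality, so this only frames the trial and error mentioned above.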