I have a data frame in R with 6,000 rows and 90 variables. I want to predict the sales volume of product A for the next 12 months, using data on product A as well as on competing products (B, C, D, ...). To use an LSTM, I need to reshape the data frame into a 3D array of shape (samples, timesteps, features), but I don't understand how to do that.
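
To make the target layout concrete, here is a minimal base-R sketch of how sliding windows turn a time-ordered numeric matrix into a (samples, timesteps, features) array. It uses a made-up toy matrix `series`, not the real data. The names `horizon`, `n_samples`, `x` and `y` are illustrative assumptions, and so is the choice of treating column 1 as the target for the 12 months after each window.

```r
# Toy stand-in for one time-ordered series: 72 months x 3 numeric features.
# (Made-up data; column 1 plays the role of Sales.)
set.seed(1)
n_months   <- 72
n_features <- 3
series <- matrix(rnorm(n_months * n_features),
                 nrow = n_months, ncol = n_features)

lookback <- 12  # timesteps the model sees per sample
horizon  <- 12  # months to predict after each window (assumption)

# Number of complete (input window, target window) pairs.
n_samples <- n_months - lookback - horizon + 1

x <- array(NA_real_, dim = c(n_samples, lookback, n_features))
y <- array(NA_real_, dim = c(n_samples, horizon))

for (i in seq_len(n_samples)) {
  in_rows  <- i:(i + lookback - 1)                         # input window
  out_rows <- (i + lookback):(i + lookback + horizon - 1)  # following months
  x[i, , ] <- series[in_rows, ]    # lookback x n_features slice
  y[i, ]   <- series[out_rows, 1]  # future values of the target column
}

dim(x)  # 49 x 12 x 3: (samples, timesteps, features)
dim(y)  # 49 x 12:     (samples, horizon)
```

Each sample is one window of 12 consecutive rows, so the features sit in the last dimension and the timesteps in the middle one.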

My df looks somewhat like this:

Date        Product  Sales     X1         X2       X3       ...  X87
2017-01-01  A         0.65438   0.45438   -1.2670   0.3215  ...  1.35623
2017-01-01  B        -0.55468   0.12436   -1.5677  -0.3215  ...  1.35623
2017-01-01  C         0.65981   1.12345   -0.5574   0.3215  ...  1.35623
2017-02-01  A        -0.12338  -1.12345    0.4543  -1.5673  ...  0.42961
...         ...      ...       ...        ...      ...      ...  ...
2022-12-01  C         0.34568   1.134598   0.5678  -1.2648  ...  0.34675

So far, I have split the data into training and test sets and normalized them. Then I ran:

# Reshape to 3-dimensional array
train_data_lstm <- train_data %>%
  as.matrix() %>% 
  array(dim = c(nrow(train_data), ncol(train_data), 1))

test_data_lstm <- test_data %>%
  as.matrix() %>% 
  array(dim = c(nrow(test_data), ncol(test_data), 1))

# Prepare sequences
lookback <- 12  # how many steps back the model should look

My questions:

  1. How do I continue from here?
  2. Do I need to remove the Date variable? What about the product labels?
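
On the second question, one possible approach (an assumption, not the only way to model this) is to pivot the long table so that each date becomes a single row, with one column per variable and product. Date is then used only to order the rows, and the product label ends up in the column names instead of being a column. The sketch below uses made-up toy data with three products and two stand-in features. The names `wide` and `feature_matrix` are hypothetical.

```r
# Toy long-format data shaped like the table above (made-up values):
# one row per Date x Product, with Sales and two stand-in features.
set.seed(42)
dates    <- seq(as.Date("2017-01-01"), as.Date("2022-12-01"), by = "month")
products <- c("A", "B", "C")
df <- data.frame(
  Date    = rep(dates, times = length(products)),
  Product = rep(products, each = length(dates)),
  stringsAsFactors = FALSE
)
df$Sales <- rnorm(nrow(df))
df$X1    <- rnorm(nrow(df))
df$X2    <- rnorm(nrow(df))

# Long -> wide: one row per Date, one column per (variable, product) pair,
# named like Sales.A, X1.A, X2.A, Sales.B, ...
wide <- reshape(df, idvar = "Date", timevar = "Product", direction = "wide")

# Date only fixes the row order; once sorted, drop it from the features.
wide <- wide[order(wide$Date), ]
feature_matrix <- as.matrix(wide[, setdiff(names(wide), "Date")])

dim(feature_matrix)         # 72 months x 9 numeric columns
is.numeric(feature_matrix)  # TRUE: no Date or character columns left
```

The result is a purely numeric, time-ordered matrix with one row per month, which is the form a windowing step based on `lookback` would expect. With many products and 87 features each, the wide matrix gets very wide, so some feature selection may be needed.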

Thank you in advance!

Vitalizzare