How to train a hierarchical model in two parts

Question

This is a follow-up to an earlier question: Confused about how to implement time-distributed LSTM + LSTM

The current draft structure that is working well:

The basic idea is that a TimeDistributed deep LSTM input layer works on each epoch of raw time series data and outputs a feature vector for each epoch. The "outer" deep LSTM layer then takes 7 of those sequential outputs and tries to classify the center epoch (the assumption being that a single epoch does not carry enough information to be classified by itself and needs the surrounding epochs). I call this a draft because I haven't yet explored the feature space required for it to work well across many subjects.
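To make the data layout concrete, here is a minimal sketch of how windows of 7 consecutive epochs could be paired with the label of the center epoch. The function name `make_windows` and the toy data are hypothetical and not taken from the actual model; only the window width of 7 and the center-epoch target come from the description above.

```python
def make_windows(epochs, labels, width=7):
    """Pair each run of `width` consecutive epochs with the center epoch's label."""
    if width % 2 == 0:
        raise ValueError("width must be odd so that a center epoch exists")
    half = width // 2
    windows, targets = [], []
    # Only epochs with `half` neighbors on both sides can be a center.
    for center in range(half, len(epochs) - half):
        windows.append(epochs[center - half:center + half + 1])
        targets.append(labels[center])
    return windows, targets


# Toy data (placeholder values): 10 epochs of 4 raw samples each.
epochs = [[float(i)] * 4 for i in range(10)]
labels = list(range(10))

windows, targets = make_windows(epochs, labels, width=7)
print(len(windows), len(windows[0]), targets)
```

Stacked into an array, such windows would have the shape (samples, 7, timesteps, channels), with the TimeDistributed wrapper applying the inner layers along the second axis.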

Several issues still need to be resolved, but the one I haven't found any clear-cut examples of online is how to train this model in two parts: 1) the TimeDistributed layer and 2) the "outer" layer. The reason is that as I increase the number of epochs needed to classify (currently 7, but I expect it may reach 21 or higher), more duplicated data is loaded and training slows down quickly.
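The duplication can be quantified with a small sketch. It assumes the inner layer were already trained and frozen (how to get there is exactly the open question), so that each epoch could be encoded once and the outer layer's windows built from cached feature vectors. `encode_epoch` is a hypothetical stand-in for the inner network, not a Keras call; it only counts how often it runs. The sizes (100 epochs, window width 21) are made up for illustration.

```python
calls = 0


def encode_epoch(epoch):
    """Hypothetical stand-in for the frozen inner network: epoch -> feature vector."""
    global calls
    calls += 1
    return [sum(epoch) / len(epoch), max(epoch) - min(epoch)]


def windows_of(items, width):
    """All runs of `width` consecutive items."""
    return [items[i:i + width] for i in range(len(items) - width + 1)]


n_epochs, width = 100, 21  # illustrative sizes, not from the real data set
epochs = [[float(i + j) for j in range(8)] for i in range(n_epochs)]

# End to end: every window re-encodes all of its epochs.
calls = 0
for window in windows_of(epochs, width):
    _ = [encode_epoch(e) for e in window]
end_to_end_calls = calls

# Two parts: encode each epoch once, then window the cached features.
calls = 0
cache = [encode_epoch(e) for e in epochs]
feature_windows = windows_of(cache, width)
two_part_calls = calls

print(end_to_end_calls, two_part_calls)
```

With these toy numbers, the encoder work per pass drops from 80 * 21 = 1680 calls to 100. The saving only applies once the inner layer's weights are fixed, which is why training that layer separately is the crux.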

One might propose an autoencoder for the first layer. However, I don't think this is the best solution, because the features needed to reproduce the input may well differ from the features that, combined with those of neighboring epochs, are needed to classify the center epoch. To expand: this is likely because the time series is semi-periodic, with most of each epoch providing little information beyond the current period from one important feature to the next (and the number and location of these important features vary from epoch to epoch).

