I have been working on a time series analysis using an LSTM model implemented in TensorFlow 2.0. Now, it is clear that this is accomplished by creating windows of data. For instance, the first 30 values of the time series form a window, i.e. the input, and the next value serves as the target.
I came across the following function for creating these windows:

import tensorflow as tf

def windowed_dataset(series, window_size, batch_size, shuffle_buffer):
    # Add a feature axis so each time step is a length-1 vector: (T,) -> (T, 1).
    series = tf.expand_dims(series, axis=-1)
    ds = tf.data.Dataset.from_tensor_slices(series)
    # Slide a window of window_size + 1 steps along the series, one step at a time.
    ds = ds.window(window_size + 1, shift=1, drop_remainder=True)
    # Each window is itself a small dataset; batch it into a single tensor.
    ds = ds.flat_map(lambda w: w.batch(window_size + 1))
    ds = ds.shuffle(shuffle_buffer)
    # Split each window into (input, target): the target is the window shifted by one step.
    ds = ds.map(lambda w: (w[:-1], w[1:]))
    return ds.batch(batch_size).prefetch(1)
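To check my understanding of what this produces, I ran a toy version of the same pipeline (my own sketch, not from the original material), with a series of six values and a window size of 3:

series = tf.range(6, dtype=tf.float32)
ds = tf.data.Dataset.from_tensor_slices(series)
ds = ds.window(4, shift=1, drop_remainder=True)
ds = ds.flat_map(lambda w: w.batch(4))
for w in ds:
    print(w[:-1].numpy(), "->", w[1:].numpy())
# [0. 1. 2.] -> [1. 2. 3.]
# [1. 2. 3.] -> [2. 3. 4.]
# [2. 3. 4.] -> [3. 4. 5.]

So each input window is paired with the same window shifted forward by one step.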
This returns a tf.data.Dataset object when a series is passed in along with the other arguments, like so:
window_size = 30
batch_size = 128
shuffle_buffer_size = 1000
series_dataset = windowed_dataset(series_train, window_size, batch_size=batch_size, shuffle_buffer=shuffle_buffer_size)
On examining this object, I found that each element is a tuple of two tensors, each holding a batch of 128 windows of 30 values apiece (as defined by the arguments passed).
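Here is how I inspected it (my own check, using the names defined above):

for x_batch, y_batch in series_dataset.take(1):
    print(x_batch.shape, y_batch.shape)   # (128, 30, 1) (128, 30, 1) in my case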
This is all well and good, but what confuses me is that after defining a model like so,
model = tf.keras.models.Sequential([
    tf.keras.layers.Conv1D(filters=32, kernel_size=3,
                           strides=1, padding="causal",
                           activation="relu",
                           input_shape=[None, 1]),
    tf.keras.layers.LSTM(32, return_sequences=True),
    tf.keras.layers.LSTM(32, return_sequences=True),
    tf.keras.layers.Dense(1),
    tf.keras.layers.Lambda(lambda x: x * 200)
])
optimizer = tf.keras.optimizers.SGD(learning_rate=1e-7, momentum=0.9)
model.compile(loss=tf.keras.losses.Huber(),
              optimizer=optimizer,
              metrics=["mae"])
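As a sanity check of my own (not part of the original code), I confirmed that this model maps a (batch, time, 1) input to a (batch, time, 1) output, which lines up with the shifted-sequence targets produced by windowed_dataset:

import numpy as np

dummy = np.zeros((1, 30, 1), dtype="float32")
print(model(dummy).shape)   # (1, 30, 1): one prediction per time step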
this dataset itself can be passed to the model.fit() method:

history = model.fit(series_dataset, epochs=500)
How is it possible that the target does not need to be defined here? Normally you would do something like model.fit(x=inputs, y=targets, epochs=num_epochs). Why is that not necessary here?
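For concreteness, this is the pattern I mean, with dummy arrays purely for illustration (the shapes are my assumption, matching the windows above):

import numpy as np

inputs = np.zeros((1000, 30, 1), dtype="float32")    # hypothetical input windows
targets = np.zeros((1000, 30, 1), dtype="float32")   # hypothetical target windows
model.fit(x=inputs, y=targets, epochs=10)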