
I'm trying to use the tf.contrib.seq2seq module to forecast some data (just float32 vectors), but all the examples I've found for TensorFlow's seq2seq module deal with translation and therefore rely on embeddings.

I'm struggling to understand exactly what tf.contrib.seq2seq.Helper does in the seq2seq architecture and how I can use CustomHelper in my case.
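As I understand it, a Helper is essentially the policy that decides what the decoder cell receives as input at each time step: during training it feeds the ground-truth frame (teacher forcing), at inference it feeds the previous prediction back in. A pure-Python sketch of that idea (the `step` function is a made-up stand-in for one decoder cell step plus projection, not part of the TF API):

```python
def step(x):
    # Hypothetical stand-in for one decoder cell step + output projection.
    return x + 1.0

def decode(go_frame, num_steps, targets=None):
    """Run the decoder loop.

    With targets: teacher forcing (what TrainingHelper does).
    Without targets: feed each prediction back in (inference-style helper).
    """
    inputs, outputs = go_frame, []
    for t in range(num_steps):
        y = step(inputs)
        outputs.append(y)
        # TrainingHelper: next input is the ground-truth frame;
        # inference helper: next input is the prediction itself.
        inputs = targets[t] if targets is not None else y
    return outputs

print(decode(0.0, 3, targets=[5.0, 7.0, 9.0]))  # teacher forcing: [1.0, 6.0, 8.0]
print(decode(0.0, 3))                           # autoregressive:  [1.0, 2.0, 3.0]
```

The two runs diverge precisely because the training loop conditions each step on the true frame while the inference loop conditions on its own output, which is the distinction the different Helper classes encapsulate.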

This is what I've done for now:

import tensorflow as tf 
from tensorflow.python.layers import core as layers_core

input_seq_len = 15 # Sequence length as input
input_dim = 1 # Nb of features in input

output_seq_len = forecast_len = 20 # horizon length for forecasting
output_dim = 1 # nb of features to forecast


encoder_units = 200 # nb of units in each cell for the encoder
decoder_units = 200 # nb of units in each cell for the decoder

attention_units = 100

batch_size = 8


graph = tf.Graph()
with graph.as_default():

    learning_ = tf.placeholder(tf.float32)

    with tf.variable_scope('Seq2Seq'):

        # Placeholder for encoder input
        enc_input = tf.placeholder(tf.float32, [None, input_seq_len, input_dim])

        # Placeholder for decoder output - Targets
        target = tf.placeholder(tf.float32, [None, output_seq_len, output_dim])


        ### BUILD THE ENCODER

        # Build RNN cell
        encoder_cell = tf.nn.rnn_cell.BasicLSTMCell(encoder_units)

        initial_state = encoder_cell.zero_state(batch_size, dtype=tf.float32)

        # Run Dynamic RNN
        #   encoder_outputs: [batch_size, seq_size, num_units]
        #   encoder_state: LSTMStateTuple with c and h, each [batch_size, num_units]
        encoder_outputs, encoder_state = tf.nn.dynamic_rnn(encoder_cell, enc_input, initial_state=initial_state)

        ## Attention layer

        attention_mechanism_bahdanau = tf.contrib.seq2seq.BahdanauAttention(
            num_units = attention_units, # depth of query mechanism
            memory = encoder_outputs, # hidden states to attend (output of RNN)
            normalize=False, # normalize energy term
            name='BahdanauAttention')

        attention_mechanism_luong = tf.contrib.seq2seq.LuongAttention(
            num_units = encoder_units,
            memory = encoder_outputs,
            scale=False,
            name='LuongAttention'
        )


        ### BUILD THE DECODER

        # Simple Dense layer to project from rnn_dim to the desired output_dim
        projection = layers_core.Dense(output_dim, use_bias=True, name="output_projection")

        # Helper that feeds the decoder its input at each time step
        helper = tf.contrib.seq2seq.TrainingHelper(
            target,
            sequence_length=[output_seq_len] * batch_size)
        # This is where I don't really know what to do in my case:
        # is this function changing my data into [GO, data, END]?

        decoder_cell = tf.nn.rnn_cell.BasicLSTMCell(decoder_units)

        attention_cell = tf.contrib.seq2seq.AttentionWrapper(
            cell = decoder_cell,
            attention_mechanism = attention_mechanism_luong, # Instance of AttentionMechanism
            attention_layer_size = attention_units,
            name="attention_wrapper")

        initial_state = attention_cell.zero_state(batch_size=batch_size, dtype=tf.float32)
        initial_state = initial_state.clone(cell_state=encoder_state)

        decoder = tf.contrib.seq2seq.BasicDecoder(attention_cell, initial_state=initial_state, helper=helper, output_layer=projection)

        outputs, _, _ = tf.contrib.seq2seq.dynamic_decode(decoder=decoder)


        # Loss function:

        # outputs is a BasicDecoderOutput; rnn_output holds the projected predictions
        loss = 0.5 * tf.reduce_sum(tf.square(outputs.rnn_output - target), -1)
        loss = tf.reduce_mean(loss, 1)
        loss = tf.reduce_mean(loss)

        # Optimizer

        optimizer = tf.train.AdamOptimizer(learning_).minimize(loss)

I understand that training and inference work quite differently in the seq2seq architecture, but I don't know how to use the Helpers from the module to distinguish between the two. I'm using this module because it's quite useful for attention layers. How can I use a Helper to create a ['GO', [input_sequence]] input for the decoder?
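With continuous data there is no GO token in a vocabulary, so one common workaround (a convention I'm assuming here, not something the API mandates) is to use an all-zeros frame as GO and shift the targets right by one step, so the decoder input at step t is the target from step t-1. A NumPy sketch of that preprocessing, using the shapes from the code above:

```python
import numpy as np

batch_size, output_seq_len, output_dim = 8, 20, 1
target = np.random.rand(batch_size, output_seq_len, output_dim).astype(np.float32)

# GO frame: an all-zeros vector (a convention, not part of tf.contrib.seq2seq)
go_frame = np.zeros((batch_size, 1, output_dim), dtype=np.float32)

# Decoder inputs = [GO, target[0], ..., target[T-2]]; the decoder is then
# trained to emit target[t] when fed target[t-1] (teacher forcing)
dec_input = np.concatenate([go_frame, target[:, :-1, :]], axis=1)

print(dec_input.shape)  # (8, 20, 1)
```

At graph-build time the same shift could be done with tf.concat on the target placeholder, and this shifted tensor, rather than target itself, would be what gets passed to TrainingHelper, in line with the suggestion in the comments.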

Anthony D'Amato
  • To clarify, you have continuous data and so it's not clear how the seq2seq architecture applies? Or is this a more concrete API question? – Allen Lavoie Nov 28 '17 at 19:15
  • It's about the API, I don't really get how to use it with continuous data. – Anthony D'Amato Nov 30 '17 at 06:19
  • Sorry for bumping this post, but do you have any idea how to use it with data series, such as temperature, pollution levels, etc.? – Anthony D'Amato Jan 30 '18 at 01:28
  • I don't see much issue with what you have there from an API perspective: rather than embedding you just feed real-valued data directly (so input dim replaces embedding dim), and switch up the loss. It looks like you're feeding targets to the `TrainingHelper`, which should get the inputs instead? – Allen Lavoie Jan 31 '18 at 18:04
