I am currently using a generative RNN to classify indices in a sequence (sort of saying whether something is noise or not noise).
My input is continuous (i.e. a real value between 0 and 1) and my output is either 0 or 1.
For example, if the model marks a 1 for numbers greater than 0.5 and 0 otherwise,
[.21, .35, .78, .56, ..., .21] => [0, 0, 1, 1, ..., 0]:
     0     0     1     1           0
     ^     ^     ^     ^           ^
     |     |     |     |           |
o -> L1 -> L2 -> L3 -> L4 -> ... -> L10
     ^     ^     ^     ^           ^
     |     |     |     |           |
    .21   .35   .78   .56   ...   .21
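To make the target mapping concrete, the toy labelling rule from the example can be written out directly. This is only a sketch of the desired input/output relationship, not of the RNN; the helper name `label_sequence` and its `threshold` parameter are made up for illustration.

```python
def label_sequence(values, threshold=0.5):
    """Return 1 for each value strictly greater than threshold, else 0."""
    return [1 if v > threshold else 0 for v in values]


# A shortened version of the example sequence from the question.
inputs = [0.21, 0.35, 0.78, 0.56, 0.21]
labels = label_sequence(inputs)
print(labels)  # [0, 0, 1, 1, 0]
```

The point is that there is one label per time step, so the targets have the same length as the input sequence.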
Using
import tensorflow as tf

n_steps = 10
n_inputs = 1
n_neurons = 7
n_outputs = 2  # assumption: two classes (0 and 1)
X = tf.placeholder(tf.float32, [None, n_steps, n_inputs])
y = tf.placeholder(tf.float32, [None, n_steps, n_outputs])
cell = tf.contrib.rnn.BasicRNNCell(num_units=n_neurons, activation=tf.nn.relu)
rnn_outputs, states = tf.nn.dynamic_rnn(cell, X, dtype=tf.float32)
rnn_outputs becomes a tensor of shape (?, 10, 7), presumably 7 outputs for each of the 10 time steps.
Previously, I ran the following snippet on the output-projection-wrapped rnn_outputs
to get one classification label per sequence.
xentropy = tf.nn.sparse_softmax_cross_entropy_with_logits(labels=y, logits=logits)
loss = tf.reduce_mean(xentropy)
How would I run something similar on rnn_outputs to get a label for every step of the sequence?
Specifically,
1. Can I get the rnn_output from each step and feed it into a softmax?
curr_state = rnn_outputs[:,i,:]
logits = tf.layers.dense(curr_state, n_outputs)
xentropy = tf.nn.sparse_softmax_cross_entropy_with_logits(labels=y, logits=logits)
2. What loss function should I use, and should it be applied across every value of every sequence (for sequence i and step j, loss = y_{ij}(true) - y_{ij}(predicted))?
Should my loss be loss = tf.reduce_mean(np.sum(xentropy))?
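To make the second question concrete, here is a minimal NumPy sketch of one possible reading of "applied across every value of every sequence": a softmax cross-entropy is computed for every (sequence, step) pair and then averaged over both axes. It assumes logits of shape (batch, n_steps, n_classes) and integer labels of shape (batch, n_steps); the function name `per_step_cross_entropy` and the random data are hypothetical, and this is not TensorFlow's own implementation.

```python
import numpy as np


def per_step_cross_entropy(logits, labels):
    """logits: (batch, n_steps, n_classes); labels: (batch, n_steps) integers."""
    # Numerically stable log-softmax over the class axis.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))
    # Pick the log-probability of the true class at every (sequence, step).
    true_log_probs = np.take_along_axis(log_probs, labels[..., None], axis=-1)
    return -true_log_probs.squeeze(-1)  # shape (batch, n_steps)


rng = np.random.default_rng(0)
logits = rng.normal(size=(3, 10, 2))       # 3 sequences, 10 steps, 2 classes
labels = rng.integers(0, 2, size=(3, 10))  # one 0/1 label per step

xentropy = per_step_cross_entropy(logits, labels)  # one loss value per step
loss = xentropy.mean()                             # average over sequences and steps
```

Note that taking the mean of an already summed scalar returns that scalar unchanged, so a sum and a mean over the per-step losses are alternatives rather than something to compose.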
EDIT: It seems that https://machinelearningmastery.com/develop-bidirectional-lstm-sequence-classification-python-keras/ describes something similar to what I am trying to implement in TensorFlow.
In Keras, there's a TimeDistributed wrapper:
You can then use TimeDistributed to apply a Dense layer to each of the 10 timesteps, independently
How would I go about implementing something similar in TensorFlow?
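For reference, the computation that a time-distributed dense layer performs can be sketched in plain NumPy: the same weight matrix is applied to the output at every time step, which is equivalent to folding the time axis into the batch axis, applying one dense layer, and unfolding again. The batch size of 4, the random stand-in for rnn_outputs, and n_outputs = 2 are assumptions made for the example.

```python
import numpy as np

n_steps, n_neurons, n_outputs = 10, 7, 2  # n_outputs = 2 is an assumption

rng = np.random.default_rng(0)
rnn_outputs = rng.normal(size=(4, n_steps, n_neurons))  # stand-in for the RNN output
W = rng.normal(size=(n_neurons, n_outputs))  # one weight matrix shared by all steps
b = np.zeros(n_outputs)

# Fold time into the batch axis, apply one dense layer, then unfold again.
stacked = rnn_outputs.reshape(-1, n_neurons)                 # (4 * 10, 7)
logits = (stacked @ W + b).reshape(-1, n_steps, n_outputs)   # (4, 10, 2)

# Applying the same layer to a single time step gives the matching slice.
step_3 = rnn_outputs[:, 3, :] @ W + b
print(np.allclose(logits[:, 3, :], step_3))  # True
```

The same reshape, dense, reshape pattern should be expressible with tf.reshape and tf.layers.dense on the real rnn_outputs tensor.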