
I'm trying to implement an LSTM network in TensorFlow to play a game, but I'm having a very hard time putting the pieces together (in other words, going from this example: A Reccurent Neural Network (LSTM) implementation example using TensorFlow library and MNIST, to what my model needs). The game can be described as follows:

  • There are 4 different states (s1, s2, s3 and s4). They repeat in cycles through time until the end of the game. Not all of them are necessarily played in a given cycle, and a given state can be played several times in succession within a cycle.
    Example of a game (cycles are separated by ||): start: s1,s2,s3 || s1 || s1,s2,s3,s4 || s1,s1,s2,s2,s2,s3 || s1,s2 :end
  • Each state is represented by a 1D vector. No two states have the same vector length.
  • The actions available to the agent are the same for all 4 states.
  • The action in a cycle depends on the current state as well as on all previous states in this cycle (no Markov property) and on all previous actions taken.

Here is a diagram representing the model I think may fit this game (inspired by «Karpathy, The Unreasonable Effectiveness of RNN, many to many example»): Network's model picture.

Problems (implementation differences with the MNIST example cited above):

  1. How do I deal with the different sizes of the input vectors for the different cells, knowing that a given index in one state's input vector does not encode the same kind of information as the same index in another state's vector (this holds for all input vectors)?
  2. How do I deal with the unknown length of a cycle, and with state repetitions that may or may not occur within a cycle?
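
One encoding that might address point 1, as a minimal sketch under assumptions: give each state its own slot in a single fixed-width input vector, so that every time step has the same size and a given index always carries the same kind of information. The sizes in `STATE_SIZES` and the helper `encode_state` are hypothetical, chosen only for illustration; the only thing taken from the game description is that the four lengths differ.

```python
# Hypothetical state sizes; the game description only says all four differ.
STATE_SIZES = {"s1": 3, "s2": 5, "s3": 2, "s4": 4}

# Reserve one contiguous slot per state so an index never means two things.
OFFSETS = {}
_total = 0
for _name, _size in STATE_SIZES.items():
    OFFSETS[_name] = _total
    _total += _size
INPUT_WIDTH = _total  # sum of all state sizes


def encode_state(name, values):
    """Place `values` in the slot reserved for state `name`, zeros elsewhere."""
    if len(values) != STATE_SIZES[name]:
        raise ValueError("wrong vector length for state " + name)
    vec = [0.0] * INPUT_WIDTH
    start = OFFSETS[name]
    vec[start:start + len(values)] = [float(v) for v in values]
    return vec


# Every encoded step has the same width, whatever the state.
step = encode_state("s2", [1, 2, 3, 4, 5])
```

If a state's vector can legitimately be all zeros, the slot alone no longer identifies the state, and appending a one-hot state indicator would probably be needed as well.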
  • As far as I understand, your problem is similar to language modelling with variable-length sequences. I have heard this is handled with padding, but I haven't tried it. If that is the case, the links below are useful: http://stackoverflow.com/questions/34670112/tensorflow-rnn-with-varying-length-sentences https://www.reddit.com/r/MachineLearning/comments/3sok8k/tensorflow_basic_rnn_example_with_variable_length/ – Aravind Pilla Jun 07 '16 at 10:06
  • That does seem appropriate. Thanks, I will look into it. – P. Philippe Jun 09 '16 at 16:29
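
The padding idea from the comments could look roughly like the sketch below. It assumes every time step has already been brought to a common width, and it treats each cycle as one sequence; the helper name `pad_cycles` is hypothetical. The true lengths are returned alongside the padded batch so the padded steps can be ignored later, for example through the `sequence_length` argument of `tf.nn.dynamic_rnn` in TensorFlow 1.x.

```python
def pad_cycles(cycles, input_width):
    """Zero-pad each cycle (a list of step vectors) to the longest cycle.

    Returns (padded, lengths). `padded` is indexed as
    [cycle][step][feature]; `lengths` holds the true number of steps
    in each cycle, so padded steps can be masked out downstream.
    """
    lengths = [len(cycle) for cycle in cycles]
    max_steps = max(lengths)
    padded = []
    for cycle in cycles:
        # Copy the real steps, then append all-zero steps up to max_steps.
        steps = [list(step) for step in cycle]
        steps += [[0.0] * input_width for _ in range(max_steps - len(cycle))]
        padded.append(steps)
    return padded, lengths


# Two cycles of different length, with a hypothetical step width of 2.
cycles = [
    [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]],  # e.g. s1, s2, s3
    [[1.0, 1.0]],                          # e.g. s1 only
]
padded, lengths = pad_cycles(cycles, input_width=2)
```

Repeated states need no special handling in this layout: a cycle such as s1,s1,s2 is simply a sequence of three steps.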
