tf.nn.rnn_cell.GRUCell ops are placed on the CPU device

Question

I'm training a 2-layer seq2seq model that uses GRU cells:

def create_rnn_cell():
    encoDecoCell = tf.contrib.rnn.GRUCell(emb_dim)
    encoDecoCell = tf.contrib.rnn.DropoutWrapper(
        encoDecoCell,
        input_keep_prob=1.0,
        output_keep_prob=0.7,
    )
    return encoDecoCell

encoder_mutil = tf.contrib.rnn.MultiRNNCell(
    [create_rnn_cell() for _ in range(num_layers)],
)

query_encoder_emb = tf.contrib.rnn.EmbeddingWrapper(
    encoder_mutil,
    embedding_classes=vocab_size,
    embedding_size=word_embedding,
)

I used a Timeline object to measure the execution time of each node in the graph, and found that most operations inside the GRU cell (including MatMul) ran on the CPU, which made training very slow. I installed the GPU version of TensorFlow 1.8. Did I miss something here? I suspect something is wrong with tf.variable_scope, because I use different buckets for the training data. This is how I reuse the variables between buckets:

for i, bucket in enumerate(buckets):
    with tf.variable_scope(name_or_scope="RNN_encoder", reuse=True if i > 0 else None) as var_scope:
        query_output, query_state = tf.contrib.rnn.static_rnn(
            query_encoder_emb,
            inputs=self.query[:bucket[0]],
            dtype=tf.float32,
        )

[Image: execution time screenshot]
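The per-device breakdown described above can also be read from the Timeline output without a screenshot. The sketch below sums op durations per device from a Chrome-trace dictionary. It assumes the trace has a "traceEvents" list in which "process_name" metadata events map a pid to a device label and "X" events carry a "dur" in microseconds. The function name time_per_device and the sample data are hypothetical, written by hand for illustration and not taken from the run in the question.

```python
from collections import defaultdict


def time_per_device(trace):
    """Sum op durations (microseconds) per device label in a Chrome-trace dict."""
    events = trace.get("traceEvents", [])

    # Assumption: metadata events ("ph" == "M", name "process_name")
    # map each pid to a human-readable device label.
    pid_to_device = {
        e["pid"]: e["args"]["name"]
        for e in events
        if e.get("ph") == "M" and e.get("name") == "process_name"
    }

    totals = defaultdict(int)
    for e in events:
        if e.get("ph") == "X":  # "complete" events carry a duration
            device = pid_to_device.get(e.get("pid"), "unknown")
            totals[device] += e.get("dur", 0)
    return dict(totals)


# Hand-written sample for illustration only; labels and timings are made up.
sample_trace = {
    "traceEvents": [
        {"ph": "M", "name": "process_name", "pid": 1,
         "args": {"name": "/device:CPU:0 Compute"}},
        {"ph": "M", "name": "process_name", "pid": 2,
         "args": {"name": "/device:GPU:0 Compute"}},
        {"ph": "X", "name": "MatMul", "pid": 1, "ts": 0, "dur": 900},
        {"ph": "X", "name": "GatherV2", "pid": 1, "ts": 900, "dur": 50},
        {"ph": "X", "name": "Softmax", "pid": 2, "ts": 950, "dur": 30},
    ]
}

if __name__ == "__main__":
    for device, micros in sorted(time_per_device(sample_trace).items()):
        print(device, micros)
```

With a real run, the input would be json.loads() applied to the string returned by Timeline(run_metadata.step_stats).generate_chrome_trace_format(). If nearly all of the time falls under a CPU label, the ops are not being placed on the GPU.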

score 0 · Answer 1 · answered Jul 24 '18 at 11:37


I found the problem. The source code of tf.contrib.rnn.EmbeddingWrapper explicitly uses the CPU device. I rewrote this wrapper, and now it runs on the GPU and is much faster. So be careful if you want to use tf.contrib.rnn.EmbeddingWrapper.


Ming
