
I'm facing issues with getting tf.nn.conv2d_transpose to work correctly. Here is a small reproduction of what I'm trying to do:

import tensorflow as tf
import numpy as np

# Shape (2, 3, 3, 1) == (batch_sz, height, width, channels)
inp = tf.Variable(np.array(
    [
        [
            [[1], [2], [3]],
            [[2], [3], [4]],
            [[7], [8], [9]]
        ],
        [
            [[3], [2], [1]],
            [[2], [7], [2]],
            [[3], [2], [0]]
        ]
    ], dtype = np.float32
))
# Shape (5, 5, 3, 1) == (kH, kW, out_channels, in_channels)
ker = tf.Variable(np.array(
    [
        [[[1],[2],[1]], [[2],[2],[2]], [[1],[2],[1]], [[2],[1],[1]], [[1],[1],[1]]],
        [[[1],[2],[1]], [[2],[2],[2]], [[1],[2],[1]], [[2],[1],[1]], [[1],[1],[1]]],
        [[[1],[2],[1]], [[2],[2],[2]], [[1],[2],[1]], [[2],[1],[1]], [[1],[1],[1]]],
        [[[1],[2],[1]], [[2],[2],[2]], [[1],[2],[1]], [[2],[1],[1]], [[1],[1],[1]]],
        [[[1],[2],[1]], [[2],[2],[2]], [[1],[2],[1]], [[2],[1],[1]], [[1],[1],[1]]]
    ], dtype = np.float32
))
out = tf.nn.conv2d_transpose(inp, ker, (2, 7, 7, 1), (1, 1, 1, 1), padding='SAME', data_format='NHWC', name='conv_transpose')

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    output, kernel, input = sess.run([out, ker, inp])

What I want is to perform a transposed convolution on a 3x3x1 input using three 5x5x1 filters. I expect the output to be of shape 7x7x3 - but instead, I get an error saying:

InvalidArgumentError: Conv2DCustomBackpropInput: input and filter must have the same depth
     [[Node: conv_transpose_2 = Conv2DBackpropInput[T=DT_FLOAT, data_format="NHWC", dilations=[1, 1, 1, 1], padding="SAME", strides=[1, 1, 1, 1], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/device:CPU:0"](conv_transpose_2/output_shape, Variable_21/read, Variable_20/read)]]

Aren't the input and filter depth both equal to 1? I don't see what I'm doing wrong - any hints would be really appreciated. I specifically want to use tf.nn.conv2d_transpose and not tf.layers.conv2d_transpose.

narrkey
  • Have you expanded the 0th and 3rd dimensions of your arrays? – marcopah May 19 '18 at 15:39
  • Sorry for sounding like a noob. No, I have not. Why would I need to do that? Are my arrays not already in the format required by conv2d_transpose - [batches, in_height, in_width, channels] for input and [filter_height, filter_width, out_channels, in_channels] for the filters --> Docs: https://www.tensorflow.org/api_docs/python/tf/nn/conv2d_transpose – narrkey May 19 '18 at 15:44

1 Answer


This issue is similar to this Stack Overflow Issue.

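Make the following changes for your code to run:

  1. The shape of inp should be (2, 3, 3, 3) instead of (2, 3, 3, 1), so that its channel count matches the filter's in_channels.
  2. The shape of ker should be (5, 5, 1, 3) instead of (5, 5, 3, 1), so that its out_channels (the third dimension) matches the last dimension of output_shape, which is 1.
  3. Padding should be 'VALID' instead of 'SAME'. With strides of 1, 'SAME' requires the output height and width to equal the input's (3x3), while 'VALID' gives 3 + 5 - 1 = 7.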
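The shape rules behind these changes can be checked without TensorFlow. Below is a minimal sketch in plain Python. The helper names forward_conv_size and check_transpose_shapes are hypothetical and not part of TensorFlow. It assumes, as I understand the op, that conv2d_transpose accepts an output_shape only if an ordinary conv2d applied to a tensor of that shape would produce the spatial size of the value tensor, and that the channel dimensions must line up as (kH, kW, out_channels, in_channels).

```python
import math

def forward_conv_size(size, kernel, stride, padding):
    """Spatial size an ordinary conv2d produces from an input of `size`."""
    if padding == "SAME":
        return math.ceil(size / stride)
    if padding == "VALID":
        return math.ceil((size - kernel + 1) / stride)
    raise ValueError("padding must be 'SAME' or 'VALID'")

def check_transpose_shapes(value_shape, filter_shape, output_shape,
                           strides=(1, 1, 1, 1), padding="SAME"):
    """Return a list of shape problems (empty if the shapes are consistent)."""
    batch, height, width, value_ch = value_shape
    k_h, k_w, out_ch, in_ch = filter_shape
    out_batch, out_h, out_w, out_depth = output_shape
    problems = []
    if value_ch != in_ch:
        problems.append("value channels != filter in_channels")
    if out_depth != out_ch:
        problems.append("output_shape depth != filter out_channels")
    if out_batch != batch:
        problems.append("batch size mismatch")
    # The transpose is valid only if the forward conv maps output -> value size.
    if forward_conv_size(out_h, k_h, strides[1], padding) != height:
        problems.append("output height inconsistent with value height")
    if forward_conv_size(out_w, k_w, strides[2], padding) != width:
        problems.append("output width inconsistent with value width")
    return problems

# Shapes from the question: depth and spatial size are both inconsistent.
question = check_transpose_shapes((2, 3, 3, 1), (5, 5, 3, 1), (2, 7, 7, 1),
                                  padding="SAME")
# Shapes from this answer: consistent.
answer = check_transpose_shapes((2, 3, 3, 3), (5, 5, 1, 3), (2, 7, 7, 1),
                                padding="VALID")
print(question)
print(answer)
```

Under the same assumptions, the check also passes for the original (2, 3, 3, 1) input and (5, 5, 3, 1) filter if output_shape is (2, 7, 7, 3) with 'VALID' padding, which may be closer to the 7x7x3 output the question asks for.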

The working code is below (using zeros for simplicity):

import tensorflow as tf
import numpy as np

# Shape (2, 3, 3, 3) == (batch_sz, height, width, channels)
inp = tf.Variable(np.zeros((2, 3, 3, 3), dtype=np.float32))

# Shape (5, 5, 1, 3) == (kH, kW, out_channels, in_channels)
ker = tf.Variable(np.zeros((5, 5, 1, 3), dtype=np.float32))

out = tf.nn.conv2d_transpose(value = inp, filter = ker, output_shape=(2, 7, 7, 1), 
                             strides=(1, 1, 1, 1), padding='VALID', data_format='NHWC', name='conv_transpose')

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # Avoid shadowing the built-in `input`.
    output, kernel, inp_value = sess.run([out, ker, inp])
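
To see what the op computes, and why a 3x3 input with a 5x5 kernel gives a 7x7 output, here is a NumPy-only sketch of a transposed convolution for the special case of stride 1 and 'VALID' padding. The function name conv2d_transpose_valid is hypothetical. It is an illustration of the scatter-add view of the operation, not TensorFlow's implementation, and it uses the question's original layout of one input channel and three output channels.

```python
import numpy as np

def conv2d_transpose_valid(value, kernel):
    """Stride-1, VALID-padding transposed convolution in NHWC layout.

    value:  (batch, height, width, in_channels)
    kernel: (kH, kW, out_channels, in_channels)
    """
    batch, height, width, in_ch = value.shape
    k_h, k_w, out_ch, k_in = kernel.shape
    if in_ch != k_in:
        raise ValueError("value and kernel disagree on in_channels")
    out = np.zeros((batch, height + k_h - 1, width + k_w - 1, out_ch),
                   dtype=value.dtype)
    for i in range(height):
        for j in range(width):
            # Each input pixel stamps a scaled copy of the kernel onto the
            # output, summed over the input channels.
            patch = np.einsum("bc,hwoc->bhwo", value[:, i, j, :], kernel)
            out[:, i:i + k_h, j:j + k_w, :] += patch
    return out

inp = np.ones((2, 3, 3, 1), dtype=np.float32)   # 1 input channel
ker = np.ones((5, 5, 3, 1), dtype=np.float32)   # 3 output channels
out = conv2d_transpose_valid(inp, ker)
print(out.shape)  # (2, 7, 7, 3)
```

Each of the 3x3 input positions writes a 5x5 patch, so the output spans 3 + 5 - 1 = 7 positions per spatial axis, and the number of output channels comes from the kernel's third dimension.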